ABSTRACT

Norm-referenced test results reported by states and 
school districts and factors related to those scores were studied 
through mail and telephone surveys of 35 states and a nationally 
representative sample of 153 school districts to determine the degree 
to which "above average" results were being reported. Part of the
stimulus for this study came from the report by J. J. Cannell and the 
Friends of Education community group that brought the issue to 
national attention. Analyses support Cannell's general finding that
it is more common for a state or district to obtain test results 
above the national average, but they also lead to some less 
spectacular conclusions. Evidence provides strong support for the 
conclusion that norms for grades 1 through 8 from the late 1970s or 
early 1980s were often easier than more recent norms. A substantial 
portion of Cannell's "Lake Wobegon" effect may be due to the use of
old norms. There is ample evidence that scores on norm-referenced 
tests have been rising for grades 1 through 8 in recent years, but 
evidence for an actual increase in achievement is equivocal. Making 
valid inferences about achievement from test scores has always been 
difficult, but it is complicated by the current demands of 
accountability and the use of standardized tests as its primary 
indicators. Seven tables and 19 figures are provided. A 39-item list
of references is included. Seven appendices contain a sample letter
and data collection form for state testing program directors; an
interview guide; a table indicating the number of districts available
by cells in the sampling design; sample letters, data collection 
forms, and questionnaires sent to districts; tables indicating the 
district subsample for telephone interviews; grades tested by 
districts returning data; and stem-and-leaf distributions of district 
students scoring above the national median in reading and 
mathematics. (SLD)

Introduction



It has become commonplace for a state or district to report that its students
are "scoring above the national average." Indeed, it has been suggested that all 50 
states and most districts are reporting above average achievement test scores 
(Cannell, 1987). Is it really the case that all states claim that their students are 
performing above average on achievement tests? If so, how should such results be 
interpreted?

These are two of several questions that motivated a study of (a) norm- 
referenced test results that are being reported by states and school districts and 
(b) factors related to those scores. This report presents part of the findings of that 
study. Published reports and results of mail and telephone surveys of states and a
nationally representative sample of school districts were used to document the 
degree to which "above average" achievement test results are being presented. 
Analyses of the possible influence of the changing meaning of norms are also 
presented. Subsequent reports will address a number of other factors that may have 
an impact on the achievement test scores of states and districts and on the proper 
interpretation of those results. 



Background 

Standardized achievement tests have long been used by schools to report 
student achievement to parents, policy makers, and the general public. In recent 
years, however, the attention given to test scores has increased dramatically. Low-
stakes testing programs with results returned to teachers and reported in a low-key
fashion to school boards and interested parents have given way to high-stakes 
testing programs that have direct and important effects on students, teachers, and 
school administrators. The increased emphasis on the use of test results for purposes 
of accountability has made questions of test quality and the trustworthiness of 
interpretations of major concern to educators and policy makers.

A common, albeit not the only or necessarily the best, way of providing the
various audiences a means of interpreting test scores is to compare achievement test 
scores for a school building, a district, or a state to national norms. Slightly over half 
of the states and a substantial majority of the school districts rely on off-the-shelf, 
standardized achievement tests, for which normative comparisons provide a primary 
basis of interpretation. These comparisons take on a wide variety of forms, including 
the average grade equivalent score, the average normal curve equivalent score, the 
median percentile rank or percentile rank of the mean, the proportion of students 
scoring above the "national average," or more precisely, the national median, and 
the proportions of students with "below average, average, or above average" scores
where the three categories correspond to stanines 1 through 3, 4 through 6, and 7 through 9,
respectively. In each of these examples, national norms provide the primary basis
of comparison.
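
As a rough illustration of how these reporting scales relate to one another, the sketch below converts a national percentile rank into a normal curve equivalent (NCE) and a stanine. It assumes a normal score distribution, the conventional NCE scaling constant of 21.06, and the usual 4-7-12-17-20-17-12-7-4 percent stanine bands. The function names and the treatment of boundary cases are illustrative assumptions and are not taken from any publisher's norms tables.

```python
from statistics import NormalDist

STD_NORMAL = NormalDist()  # mean 0, standard deviation 1


def percentile_to_nce(percentile_rank: float) -> float:
    """Convert a percentile rank (strictly between 0 and 100) to an NCE."""
    z = STD_NORMAL.inv_cdf(percentile_rank / 100.0)
    return 50.0 + 21.06 * z


def percentile_to_stanine(percentile_rank: float) -> int:
    """Map a percentile rank to a stanine (1 through 9).

    Assumption: a rank exactly on a band boundary goes to the lower stanine.
    """
    # Cumulative percent of the norm group at the top of each stanine.
    upper_bounds = [4, 11, 23, 40, 60, 77, 89, 96, 100]
    for stanine, bound in enumerate(upper_bounds, start=1):
        if percentile_rank <= bound:
            return stanine
    raise ValueError("percentile rank must not exceed 100")


def stanine_category(stanine: int) -> str:
    """Three-way grouping of stanines described in the text."""
    if stanine <= 3:
        return "below average"
    if stanine <= 6:
        return "average"
    return "above average"
```

Under these assumptions a score at the national median (percentile rank 50) has an NCE of 50 and falls in stanine 5, the middle of the "average" category.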

Norms, of course, are not the only basis of test score interpretation. Some 
states and districts rely on criterion-referenced interpretations of either publisher-
or locally developed tests. In such cases, comparisons to past performance provide a
key means of interpretation. For example, trends in the proportion of students
passing a minimum-competency test, the proportion of students mastering specific 
objectives, or the average number of objectives mastered provide a means of 
comparing the current year's achievement with a benchmark. Trends may also be 
important in the interpretation of norm-referenced results, but the national norm
still provides the major frame of reference for expressing the scores. Even states 
with locally developed or customized assessment programs sometimes also use 
comparisons to national norms to aid the interpretation of their achievement test
results; these comparisons are obtained through special equating studies or item
response theory links.

The pros and cons of normative comparisons have been discussed on many 
occasions. Discussions of appropriate and inappropriate normative interpretations 
are provided, for example, by Angoff (1971), Petersen, Kolen, and Hoover (1989),
and in several introductory texts on educational and psychological measurement.
Good discussions of appropriate and inappropriate uses and interpretations of norms 
may also be found in the technical manuals and interpretive guides provided by the 
publishers of the major standardized achievement tests. 

Despite these discussions, normative interpretations continue to be misused 
and misinterpreted. The distinction that Angoff (1971) and others have made 
between the statistical meaning of "normative," which refers to "performance as it
exists," and the use of the term to refer to "standards or goals of performance" 
(p. 533) is too often overlooked. The fact that norms for school averages or district 
averages differ markedly from norms for individual students is too often ignored or is
given insufficient emphasis in interpretation. Because a school average is based on a 
range of student scores it necessarily falls somewhere in between the score of the 
highest scoring individual student and that of the lowest scoring student. 
Consequently, the distribution of school average scores is less variable than the 
distribution of individual student scores. The average achievement score that 
corresponds to the 70th percentile using school building norms, for example, may 
correspond to only the 60th percentile using norms for individual students.
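
The relation between the two kinds of norms can be sketched under simplifying assumptions: student scores and school averages are both treated as normally distributed about the same mean, and the standard deviation of school averages is taken to be a fixed fraction of the student standard deviation. The fraction used in the usage note (0.5) is a hypothetical value chosen only for illustration; actual values depend on the test and grade.

```python
from statistics import NormalDist


def student_percentile_of_school_average(school_percentile: float,
                                         sd_ratio: float) -> float:
    """Percentile rank on individual-student norms of a school average
    that sits at `school_percentile` on school-building norms.

    sd_ratio is the (hypothetical) standard deviation of school averages
    divided by the standard deviation of individual student scores.
    """
    std_normal = NormalDist()
    # Position of the school average in school-level standard deviation units.
    z_school = std_normal.inv_cdf(school_percentile / 100.0)
    # The same average expressed in student-level standard deviation units.
    z_student = z_school * sd_ratio
    return 100.0 * std_normal.cdf(z_student)
```

With a ratio of 0.5, student_percentile_of_school_average(70, 0.5) returns a value of roughly 60, which is the kind of gap described above. A school at the 50th percentile on school norms stays at the 50th percentile on student norms, because the two distributions are assumed to share a center.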

It is widely believed that some tests have "easier" norms than others. If the 
norms of test A are easier or less stringent than those of test B, then a given level of 
achievement would be expected to appear better (e.g., result in a higher percentile 
rank or a larger proportion of students scoring above the national average) with test 
A than with test B. Note that the difficulty of norms is different than the intrinsic 
difficulty of test items. A test that asked easy questions could have hard norms 
because the norming sample was unusually able in the content area of the test.
Conversely, a second test that asked relatively more difficult questions could have
easier norms because the norming sample for the second test included a
disproportionate number of low achieving students. The relative difficulty of norms
for a particular school, school district, or state may also depend on the degree to
which the test content matches the curriculum at the building or classroom levels. 

The meaning of norms depends fundamentally on the definition of the
reference population, and secondarily on the adequacy of sampling, the level of
participation, and the motivation of the students in the norming sample, among
other considerations. The year in which the norms were obtained is one of the
important properties that define the reference population and it is clearly the case 
that norms become dated. If achievement is improving nationally, then the use of 
old norms will make a district or state appear to be doing better relative to the 
nation than would the use of current norms that provide a higher standard of 
comparison. 
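
The effect of dated norms can be illustrated with a small calculation. The sketch assumes normally distributed scores and a uniform national gain, expressed in student standard deviation units, since the norming year; the gain values in the usage note are hypothetical and are not estimates from this study.

```python
from statistics import NormalDist


def percent_above_old_median(gain_in_sd: float) -> float:
    """Percent of today's national population scoring above the median of
    norms set before a national gain of `gain_in_sd` standard deviations."""
    # After the gain, the old median lies `gain_in_sd` below the current mean,
    # so the share of students above it is the normal CDF evaluated at the gain.
    return 100.0 * NormalDist().cdf(gain_in_sd)
```

With no gain the function returns 50. A hypothetical gain of one tenth of a standard deviation gives about 54, so a state performing exactly at the current national level would appear "above average" when judged against the older norms.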

Although the above concerns about the use of norms are hardly new, 
questions about the meaning and trustworthiness of normative comparisons that 
states and districts are using to communicate test results to policy makers and the 
public have recently taken on increased importance. The increased importance is
due, in part, to escalation in the stakes involved in testing. Concerns about
normative comparisons were also exacerbated by the publication of a report by Dr.
John J. Cannell (1987) titled "Nationally Normed Elementary Achievement Testing
in America's Public Schools: How All Fifty States Are Above Average." 

The Cannell report is based on a survey conducted by a community group,
the Friends of Education, which found that "no state scores below the publisher's
'national norm' at the elementary level on any of the six major nationally normed,
commercially available tests" (Cannell, 1987, p. 2, emphasis in original). Based on 
this finding, Cannell concluded that "standardized, nationally normed achievement 
tests give children, parents, school systems, legislatures, and the press inflated and 
misleading reports on achievement levels" (p. 2). 

Cannell was not the first to notice that states were reporting results that 
were above the national norm in greater numbers than would be expected based on 
past experience or common-sense notions of the likely relative standing of 
particular states. In 1984, the Southern Regional Education Board (SREB) reported 
that 9 of 11 SREB states with norm-referenced test results for elementary grades
were at or above the national average (SREB, 1984). Two years later, "[i]n June,
1986, SREB first described this situation in which student achievement in nearly all
states was reported to be at or above the national averages as the 'Lake Wobegon
effect'--descriptive of Garrison Keillor's mythical town where all children are above
average" (Korcheck, 1988, p. 3). However, it was the Cannell report that placed the 
issue in the national limelight. 

The Cannell report attracted a good deal of attention in the press when it 
was released in the fall of 1987 and has been the focus of considerable debate and 
controversy among professional educators and measurement specialists ever since. 
There are undoubtedly a number of factors that helped focus attention on the 
findings. Dramatic statements regarding the findings such as those illustrated in the 
above quotes may be part of the reason. Interest in the report was probably 
enhanced also by the sharp criticisms of test publishers ("we believe inaccurate 
initial norms are the reason for high scores", p. 5, emphasis in original), of educators 
for the "integration of unchanging test questions into the curriculum" (p. 5,
emphasis in the original), of those responsible for reporting student achievement 
("no state publication honestly described norm-referenced testing," p. 6), of 
university and public educators serving as consultants to test publishers "who too 
often are mere sycophants, giving the commercial interests what they want" (p. 9), 
and of the U.S. Department of Education, "whose lack of knowledge of these tests 
constitutes nonfeasance" (p. 9, emphasis in original). 

Even without the dramatic language and sharp criticism, however, the 
Cannell report raises serious questions and issues. The percentage of students 
reported to be scoring above the national 50th percentile in a number of states 
seems to defy common sense. 

The Cannell report has been the focus of considerable discussion at national 
meetings and in professional journals concerned with issues of educational
achievement and measurement. It was a major topic, for example, at the 1988 and 
1989 Annual Assessment Conferences sponsored by the Educational Commission of 
the States. The report was featured along with six commentaries from test 
publishers and representatives of the U.S. Department of Education in the Summer 
1988 issue of Educational Measurement: Issues and Practice. The report also led the 
U.S. Department of Education to arrange a meeting involving Dr. Cannell, 
representatives of major test publishers, and selected academics to discuss the 
findings and their implications in February, 1988. 

Reviewers of the Cannell report (e.g., Drahozal & Frisbie, 1988; Koretz, 1988; 
Lenke & Keene, 1988; Phillips & Finn, 1988; Qualls-Payne, 1988; Stonehill, 1988;
Williams, 1988) identified a number of factors, some of which were also suggested by 
Cannell, that might contribute to the seemingly anomalous finding that all states are 
above the national average. The fact that norms become dated was probably the 
most frequently mentioned potential explanation. Differences in the rules for 
exclusion of students from testing in norming and in operational testing programs
were also proposed as a possible explanation by several reviewers (e.g., Drahozal &
Frisbie; Koretz; Lenke & Keene; Phillips & Finn). Other suggested partial
explanations included the possible effect of a closer match between the test and
the local curriculum in operational testing programs than in norming samples (e.g.,
Koretz; Lenke & Keene; Phillips & Finn), and the possibilities that poor security,
familiarity with the specific content of tests that are reused year after year, or
teaching the test may inflate scores (e.g., Drahozal & Frisbie; Koretz; Phillips &
Finn). 

Reviewers (e.g., Drahozal & Frisbie, 1988; Koretz, 1988; Lenke & Keene,
1988; Phillips & Finn, 1988; Williams, 1988) also identified several shortcomings of 
the Cannell study and interpretations. The failure to distinguish between group and 
individual student norms in interpretations, aggregation bias that results when the 
percentage of districts with average scores above the national median is used to 
make inferences about the percentage of students with scores above the national 
median, and the treatment of the percentage of students at the 4th stanine or 
above as if it were an indicator of the percentage of students above the national 
average are among the misleading analyses and interpretations that were identified. 
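
Two of these problems can be made concrete with a short sketch. The stanine percentages are the conventional 4-7-12-17-20-17-12-7-4 bands. The district figures are invented for illustration, and a district is counted as "above" when more than half of its students exceed the national median, which is a simplification of comparing district averages with the median.

```python
# Percent of the norm group falling in stanines 1 through 9.
STANINE_PERCENTS = [4, 7, 12, 17, 20, 17, 12, 7, 4]


def percent_at_or_above_stanine(stanine: int) -> int:
    """Percent of the national norm group at the given stanine or higher."""
    return sum(STANINE_PERCENTS[stanine - 1:])


def percent_of_districts_above(districts) -> float:
    """Percent of districts in which most students exceed the national median.

    districts: list of (students_tested, percent_of_students_above_median).
    """
    above = sum(1 for _, pct in districts if pct > 50.0)
    return 100.0 * above / len(districts)


def percent_of_students_above(districts) -> float:
    """Percent of all students, pooled across districts, above the median."""
    total = sum(n for n, _ in districts)
    return sum(n * pct for n, pct in districts) / total


# Hypothetical data: three small districts and one very large one.
HYPOTHETICAL_DISTRICTS = [(1000, 55.0), (1200, 54.0), (800, 56.0), (50000, 42.0)]
```

By construction 77% of the norm group is at stanine 4 or above, so that figure says nothing about standing relative to the national median, which only 50% exceed. In the hypothetical data three of four districts (75%) are "above average," yet fewer than 43% of the pooled students score above the national median, because the one large district dominates the student count.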

Despite these and other limitations, some reviewers concluded that Cannell's 
major findings are still probably correct. Stonehill (1988), for example, stated simply 
that "Cannell's evidence is compelling" (p. 23). Others were more circumspect.
Koretz (1988), for example, noted that "Dr. Cannell's errors are to some extent
beside the point...for they are not sufficient to call into question his basic
conclusion" (p. 11), and Phillips and Finn (1988) stated that in the absence of 
"evidence to the contrary" they generally concurred with "the central finding of Dr. 
Cannell's report" (p. 10). 



Procedure 

The Cannell study provided part of the stimulus for the present study. 
Certainly the issues raised in that study are important ones that deserve to be 
investigated in greater detail. Of particular concern were the issues of aggregation
bias, the sampling of districts to obtain estimates for states without statewide testing 
programs that provide normative comparisons to the nation, and the type of 
information obtained from districts. The Cannell study only asked districts whether
their students were above or below the national average. More detailed district 
results would be more informative. Since the Cannell study did not include results 
for secondary schools, it was also important to expand the coverage to all 
elementary and secondary school grades. 

Our interest, however, was in more than simply obtaining estimates of the 
number of states or the proportion of districts that report achievement test results 
that are above the national median or that have average achievement above the 
national mean. Such statistics are of interest, but are apt to raise more questions 
than they answer. It is evident that we also need to better understand the ways in
which states and districts are using normative comparisons, the validity of those 
comparisons, and the factors that influence the results and the validity of test scores 
and their interpretation. Therefore, the present study was designed to collect data 
not only about the achievement scores that were reported by states and districts,
but on a variety of related issues, including the way in which test results were used 
(e.g., public reporting, grade retention, school incentives), when and why the uses 
were initiated, how and when the tests were adopted, and policies regarding test 
administration, test security and the preparation of students for taking tests. The 
present report, however, is focused on the test results and the possible influence of 
changes in the stringency of norms over time. Other aspects of the project data are
addressed elsewhere (e.g., Baker, 1989; Burstein, 1989; Shepard, 1989).

State Survey 

Two national mail and telephone surveys were conducted. In the first
survey, a letter and a data collection form (see Appendix A) were mailed to the
directors of testing in all states. As can be seen in the sample copy in Appendix A,
the state testing directors were asked to provide test results in reading and
mathematics for all grades (K through 12) for the three most recent academic years
(1985-86, 1986-87, and 1987-88).

States were asked to report the percentage of students scoring above the 
national 50th percentile statewide if the information was available. When it was not
available, the states were asked to report state means and standard deviations in
reading and mathematics as well as the scores corresponding to the 25th, 50th, and
75th percentiles statewide. In addition to test score information, the states were
asked to provide the name, edition, and form of the test used at each grade; the 
year the test was first used in the state; the year it was normed; the month of 
administration; and the way the scores were routinely reported (e.g., percentage of 
students above the national median). The number of students enrolled, the number 
tested, and the number for whom scores were reported were also requested at each 
grade for each of the three years in question. 

Since much of the information we were seeking was already available in 
published reports, the state directors of testing were asked to send copies of reports 
containing the requested information. The reports served in place of completed 
data collection forms if the reports contained the necessary information. Since 
information about how scores are communicated to the public and how they are 
interpreted by the press was relevant to our interests, copies of press releases and 
newspaper articles about test results were requested. 

Following the mailings, state directors of testing were contacted by 
telephone to arrange telephone interviews. Detailed results of the telephone 
interviews are presented in other reports of study results (see Shepard, 1989);
hence only a brief description of the interview is presented here.

A copy of the telephone interview guide is shown in Appendix B. In 
addition to clarification questions about testing data requested on the data collection
forms, testing directors were asked questions about test use, test selection, the 
alignment of curriculum with the test, about time spent on teaching tested 
objectives, about objectives given less time as a result of the test, about guidelines 
for test preparation, about typical and extreme practices in preparing students to 
take tests, and about test security practices and experience. 

District Survey

A stratified random sample of districts designed to be representative of the 
fifty states was selected. The 1980 census data were used to stratify school districts
by region, size, and socio-economic status (SES). The definitions of the levels of
three stratification variables are provided in Table 1. As can be seen in Table 1, the
three stratification variables, region, size, and SES, had four, eight, and five levels,
respectively. Thus a total of 160 cells were defined. The SES index, which is
defined in Table 1, was used to rank the school districts and then to define five strata
such that approximately 15% of the students were in each of the two extreme strata
(low and high), approximately 20% were in each of the adjacent strata (above and
below average), and approximately 30% were in the average stratum. 

Five districts were randomly selected for each cell where a sufficient number
of districts was available according to the 1980 census. Five districts were available
and selected for most cells; however, 15 of the cells were void and 39 of the cells
had fewer than five districts. For example, there were no high SES districts with
enrollments of 100,000 or more in the North/Central region and there was only one
low SES district with an enrollment of 100,000 or more in the East region.
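
The sampling design described above can be sketched as follows. The stratum labels are those given in Table 1. The sampling frame passed to draw_districts is a hypothetical stand-in for the census-based list of districts, and the function is only one way such a draw might be carried out, not a record of the procedure actually used.

```python
import random
from itertools import product

REGIONS = ["East", "North/Central", "South", "West"]
SIZES = ["Less than 1,200", "1,200 to 2,499", "2,500 to 4,999",
         "5,000 to 9,999", "10,000 to 24,999", "25,000 to 49,999",
         "50,000 to 99,999", "100,000 or more"]
SES_LEVELS = ["Low", "Below Average", "Average", "Above Average", "High"]

# Each combination of region, size, and SES level is one cell: 4 * 8 * 5 = 160.
CELLS = list(product(REGIONS, SIZES, SES_LEVELS))


def draw_districts(frame, per_cell=5, seed=0):
    """Randomly order the districts in each cell and keep at most `per_cell`.

    frame maps a (region, size, ses) cell to a list of district names
    (hypothetical data). Void cells are omitted from the result.
    """
    rng = random.Random(seed)
    selected = {}
    for cell in CELLS:
        districts = list(frame.get(cell, []))
        if not districts:
            continue  # void cell: no districts available
        rng.shuffle(districts)
        selected[cell] = districts[:per_cell]
    return selected
```

Because the random order is retained, the first district listed in each cell can serve as the primary selection and the later ones as replacements. Removing the 15 void cells from the 160 defined cells leaves 145 cells with at least one district.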



Table 1 

Definitions of Stratification Variables Used to Sample School Districts 



A. REGION. Region of the country was defined to have 4 strata.

1. East. 

Connecticut, Delaware, District of Columbia, Maine, Maryland, 
Massachusetts, New Hampshire, New Jersey, New York, Pennsylvania,
Rhode Island, Vermont 

2. North/Central 

Illinois, Indiana, Iowa, Kansas, Michigan, Minnesota, Missouri, Nebraska, 
North Dakota, Ohio, South Dakota, Wisconsin 

3. South 

Alabama, Arkansas, Florida, Georgia, Kentucky, Louisiana, Mississippi,
North Carolina, South Carolina, Tennessee, Virginia, West Virginia 

4. West 

Alaska, Arizona, California, Colorado, Hawaii, Idaho, Montana, Nevada,
New Mexico, Oklahoma, Oregon, Texas, Utah, Washington, Wyoming 

B. SIZE. District enrollment, 1980 Census, 8 strata. 

1. Less than 1,200 5. 10,000 to 24,999 

2. 1,200 to 2,499 6. 25,000 to 49,999 

3. 2,500 to 4,999 7. 50,000 to 99,999 

4. 5,000 to 9,999 8. 100,000 or more 

C. SES. Community socio-economic status index based on the 1980 census. 
SES equals the median family income in thousands of dollars plus 6 times the 
median years of education of the population 25 years old or older. SES used
to define 5 strata. The labels of the strata and approximate percentage of 
students in each are: 

1. Low (15%) 

2. Below Average (20%) 

3. Average (30%) 

4. Above Average (20%) 

5. High (15%) 
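
The SES index in part C can be written out directly. The sketch below computes the index as defined above and then assigns districts to the five strata by their position in the enrollment-weighted ranking. The rule of classifying each district by the midpoint of its cumulative enrollment share is an assumption made for illustration; the report does not state how districts that straddle a cut point were handled.

```python
def ses_index(median_family_income_dollars: float,
              median_years_education: float) -> float:
    """SES = median family income in thousands of dollars
    plus 6 times median years of education."""
    return median_family_income_dollars / 1000.0 + 6.0 * median_years_education


def assign_ses_strata(districts):
    """Assign each district to one of five SES strata.

    districts: list of (name, ses_value, enrollment) tuples (hypothetical data).
    Returns a dict mapping district name to stratum label.
    """
    # Cumulative percent of students at the top of each stratum: 15/20/30/20/15.
    cutoffs = [(15, "Low"), (35, "Below Average"), (65, "Average"),
               (85, "Above Average"), (100, "High")]
    total = sum(enrollment for _, _, enrollment in districts)
    strata = {}
    cumulative = 0
    for name, _, enrollment in sorted(districts, key=lambda d: d[1]):
        # Assumed rule: classify by the midpoint of the district's share.
        midpoint_pct = 100.0 * (cumulative + enrollment / 2) / total
        cumulative += enrollment
        strata[name] = next(label for upper, label in cutoffs
                            if midpoint_pct <= upper)
    return strata
```

For example, a community with a median family income of $20,000 and a median of 12 years of education would have an SES index of 20 + 72 = 92.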




The first of the randomly-ordered districts in each of the 145 non-void cells
was selected for inclusion in the survey. Because achievement test results of large
school districts have been the focus of considerable attention in recent years, we
were particularly interested in obtaining better information about the achievement
test results being reported by larger districts. Therefore, districts with enrollments of
50,000 or more were oversampled. With the oversampling of large districts, a total
of 175 districts were selected for the sample. Appendix C lists the number of 
districts selected per cell. 

After districts were selected, telephone calls were made to confirm that the 
district was still operating (had not, for example, been consolidated with another 
district since the 1980 census), to identify appropriate respondents who were
responsible for the district testing program, and to obtain complete mailing 
addresses. Where a district no longer existed, the second listed district in the 
corresponding cell of the sampling design was selected as a replacement. Once 
addresses were obtained, letters (see Appendix D) and data collection forms were 
mailed. 

A subsample of the districts was identified for telephone interviews, which
were conducted following the mail survey (see Appendix E for a description of the
procedures used to identify the interview subsample). Because telephone
interviews were conducted with a subsample of the districts, two different letters
requesting participation and two different data collection forms were sent to districts
(see Appendix D). The same basic test data that were requested from states were
also requested for all districts. Districts in the mail-survey-only subsample were also
sent a brief questionnaire covering some of the interview questions about the use of
test results and perceived effects of testing in the district (see Appendix D).
Districts in the interview subsample did not receive a questionnaire, but were asked
questions shown in the interview guide in the telephone survey (Appendix D).

Follow-up letters were sent to districts approximately three weeks and again 
six weeks after the initial mailing. If no response was received within three weeks
after the second follow-up, attempts were made to reach respondents by telephone 
and urge them to respond to the survey. When district personnel declined to 
participate in the survey or could not be reached after repeated telephone 
attempts, the reason for the non-participation was recorded, and a substitute district
was selected from the appropriate cell in the sampling design.



Results 

States with Norm-Referenced Comparisons 

A total of 35 states provided results that allowed norm-referenced 
comparisons for one or more grades in at least one of the three years for which data
were collected (1985-86, 1986-87, and 1987-88). The remaining 15 states did not use
tests with national norms. The 35 states for which norm-referenced comparisons
were obtained are listed in Table 2 with an indication of the basis for the comparison
and the grade levels for which test results were reported. The basis for comparisons 
to national norms for states that administered an off-the-shelf, norm-referenced test 
is obvious. However, in order to obtain estimates of the percentage of students 
scoring above the national median or the percentile rank of the state mean or 
median test score, it was sometimes necessary to convert scores from the form in 
which they were reported. For example, if the state reported mean grade-
equivalent scores, those scores were converted to the corresponding percentile rank
by reference to the test publisher's norms tables for individual pupils. 






Table 2 

States with Norm-Referenced Comparisons and 
Grades Where at Least One Comparison Is Available 



                                       Basis of
State                                  Comparison*

Alaska                                 [illegible]
California                             LINK
Colorado                               NRT
[illegible]                            NRT
Hawaii                                 [illegible]
Maryland                               NRT
Missouri                               [illegible]
Nevada                                 NRT
[illegible]                            NRT
New Mexico                             NRT
North Carolina                         NRT
North Dakota                           NRT
Oklahoma                               NRT
Oregon                                 LINK
Rhode Island                           NRT
South Carolina                         NRT
South Dakota                           NRT
Tennessee                              NRT
Texas                                  LINK
Utah                                   NRT
Virginia                               NRT
Washington                             NRT
West Virginia                          NRT
Wisconsin                              NRT

[The remaining state rows and the marks (+) showing the grades tested in each state are not recoverable from the source.]

Number of States: 35

                       Grades
                    1   2   3   4   5   6   7   8   9  10  11  12

Number of States:  10  10  20  16  13  18  13  22  11  11  13   5


* NRT = Norm-Referenced Test LINK = Equated to NRT 
NRT/LINK = Some years based on NRT and others on LINK 

Several of the states listed in Table 2 obtained normative comparisons
indirectly by linking non-normed tests or state assessment results to a norm-
referenced test through the use of special equating studies or the inclusion of norm-
referenced test items with known item parameters in a customized test (see, for
example, Yen, Green, & Burket, 1987, for a discussion of customized tests). States
for which norm-referenced comparisons were obtained indirectly through such
linkages are indicated in Table 2 by the word "LINK" in the column showing the
basis of comparison. 

Although comparisons to national norms either directly or through an 
equating link could be obtained for a total of 35 states in all, the number of 
comparisons varied substantially by grade level. As can be seen in Table 2, the 
largest number of states with results for any single grade was 22 at Grade 8. Grade 3, 
with 20 states, and Grade 6, with 18 states, were used for statewide testing nearly as 
often as Grade 8. However, there was no grade for which normative comparisons 
were available for a majority of the 50 states. Test results were reported by only 10
or 11 states at Grades 1, 2, 9, and 10; only 5 states reported normative test results for 
Grade 12. 

Where possible, estimates of the percentage of students in a state who 
scored above the national median were obtained separately for each grade tested in 
reading and mathematics. Where estimates of the percentage of students above the 
national median could not be obtained, the state median percentile rank or the 
percentile rank corresponding to the statewide mean was used. Note that here, and 
throughout this report, it is the individual pupil norms, rather than norms for school 
buildings or school districts, that were used to determine percentile ranks. For some 
states, estimates of both the percentage of students above the national median and 
the median percentile rank or percentile rank of the statewide mean were available 
and used. 

The number of states and the number of students for which estimates of the 
percentage of students above the national median were obtained are reported in 
Table 3 by year of test administration, test content, and grade. Parallel numbers are 
reported in Table 4 for states where estimates of the median percentile rank or the 
percentile rank of the statewide mean were obtained. The latter numbers were also 
used to obtain weighted mean percentile ranks for the states for which those results
were obtained. In many cases the number of states and number of students in
Tables 3 or 4 are the same for mathematics as for reading, because
both content areas were usually tested and a single number of students tested was
reported for both tests. However, there are some differences (e.g., Grade 8 in Table
3), because results were available in reading but not mathematics for a given state.

Percentage of students above national median. The combined results for 
states of the percentage of students scoring above the national median are 
summarized in Figure 1. The percentages shown in Figure 1 are weighted by the 
number of students tested in each grade for the states reporting data for each of the 
three years for which data were collected. Thus each bar in the figure represents 
the percentage of students in the states that provided data in this form who scored 
above the national median for a given school year and a given grade in either 
reading or mathematics. For example, the first column for Grade 1, 1985-86, is based 
on the 281,734 first-grade students in the 7 states (see Table 3) that reported test 
results in this form; it shows that 54% of those students scored above the national 
median in reading. 
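
The weighting described above can be sketched in a few lines of Python. This is only an illustration of pooling by number of students tested; the function name and the three-state figures are hypothetical and are not taken from Table 3.

```python
def weighted_percent_above_median(states):
    """Pool state results, weighting each state by its number of students tested.

    `states` is a list of (students_tested, percent_above_median) pairs.
    """
    total_students = sum(n for n, _ in states)
    if total_students == 0:
        raise ValueError("no students tested")
    return sum(n * pct for n, pct in states) / total_students


# Hypothetical figures for three states at one grade (not from Table 3).
example_states = [(100_000, 60.0), (50_000, 50.0), (50_000, 46.0)]
pooled = weighted_percent_above_median(example_states)
unweighted = sum(pct for _, pct in example_states) / len(example_states)
```

With these made-up numbers the pooled value is 54.0, whereas a simple average of the three state percentages is 52.0, because the largest state counts for half of the students.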

The results in Figure 1 are consistent with the general results reported by 
Cannell (1987) in that the overall percentage of students above the national median 
was greater than 50 in all of the elementary grades in both reading and mathematics 
for each of the three years studied. The percentage above the national median was 
Table 3 

Number of States and Number of Students Contributing to Estimates of 
Percentage of Students Above National Median by 
Year, Test Content, and Grade 

I. Reading 

           1985-86              1986-87              1987-88
Grade   States   Students    States   Students    States   Students

    1        7    281,734         6    271,954         7    302,544
    2        8    343,490         7    329,928         7    330,255
    3       12    362,239        12    302,893        10    461,152
    4       14    460,480        13    452,447        13    485,084
    5        8    242,871         7    209,289         8    226,122
    6       10    288,671        10    231,702        11    474,498
    7       10    381,570         8    283,334         9    337,862
    8       13    445,687        16    433,801        13    505,762
    9       10    250,712         7    244,762         8    351,102
   10        8    271,706        10    296,866         8    258,866
   11       10    250,712        11    239,223        11    241,956
   12        3     65,809         3     67,782         2     68,841

II. Mathematics 

           1985-86              1986-87              1987-88
Grade   States   Students    States   Students    States   Students

    1        7    281,734         6    271,954         7    302,544
    2        8    343,490         7    329,928         7    330,255
    3       11    353,612        11    293,452         9    339,089
    4       14    460,480        13    452,447        13    485,084
    5        8    242,871         7    209,289         8    226,122
    6        9    280,053         9    222,886        10    364,093
    7       10    381,570         8    283,334         9    337,862
    8       13    445,687        15    424,959        12    396,574
    9        7    300,728         7    244,762         8    351,102
   10        8    271,706         9    287,457         8    258,866
   11       10    250,712        11    239,223        11    241,956
   12        3     65,809         3     67,782         2     68,841

Table 4 

Number of States and Number of Students Contributing to Estimates of 
Percentile Rank of State Means or Medians by Year, Test Content, and Grade 

I. Reading 

           1985-86              1986-87              1987-88
Grade   States   Students    States   Students    States   Students

    1        5    250,628         5    264,972         6    295,840
    2        6    308,342         6    323,318         7    385,391
    3       11    623,579        12    336,372        12    394,641
    4       11    389,954        12    446,642        13    509,839
    5        7    206,325         8    250,586        11    336,191
    6        8    526,312         8    245,215        11    391,526
    7        8    317,994         8    281,849        11    401,015
    8       11    403,406        16    471,619        14    468,180
    9        6    295,903         6    239,606         8    348,617
   10        6    236,868         9    291,311         8    253,699
   11        9    246,555        10    234,746        10    237,583
   12        3    276,030         2     65,120 [missing]     68,841

II. Mathematics 

           1985-86              1986-87              1987-88
Grade   States   Students    States   Students    States   Students

    1        5    250,628         5    264,972         6    295,840
    2        6    308,342         6    323,318         7    385,391
    3       11    623,579        12    336,372        12    394,641
    4       11    389,954        12    446,642        13    509,839
    5        7    206,325         8    250,586        11    336,191
    6        8    526,312         8    215,215        11    391,526
    7        8    317,994         7    244,332        11    401,015
    8       11    403,406        16    471,619        14    468,180
    9        6    295,903         6    239,606         8    348,617
   10        6    236,868         8    253,671         8    258,722
   11        9    246,555        10    234,746        10    237,583
   12        3    276,030         2     65,120         2     68,841
Figure 1 

Percentage of Students Scoring Above National Median 
Based on States Reporting (Weighted by Number of Students) 
usually greater for mathematics than for reading. Percentages were usually higher 
for elementary than secondary grade levels. For Grades 1 through 6, the percentage of 
students scoring above the national median in mathematics ranged from a low of 
58% in Grade 4 for the 1985-86 school year to a high of 71% in Grade 2 for the 
1987-88 school year, whereas the corresponding range for reading was from 52% 
(Grade 5, 1985-86) to 60% (Grade 3, 1987-88). For Grades 7 through 12, the 
percentage of students scoring above the national median ranged from 49% (Grade 
12, 1985-86) to 60% (Grade 11, 1986-87) in mathematics and from 48% (Grade 9, 
1986-87) to 55% (Grade 8, 1985-86) in reading. 

It should be noted that while the percentages displayed in Figure 1 are 
generally above the naive expectation of 50%, many individual students were, in 
fact, receiving scores that were well below the national median. If a state reported 
that 55% of its students had scores at or above the national median, for example, it 
is obviously the case that the remaining 45% of the students in the state were 
receiving scores below the national median. 

The results in Figure 1 provide only a very global picture since they combine 
the data for varying numbers of states at each grade level. They do not, for 
example, provide an indication of the variability from state to state. Some sense of 
the variability can be obtained from Figures 2 and 3, which show the distributions of 
the percentage of students above the national median in reading and in 
mathematics, respectively. 

The data for the most recent year available for each state were used for the 
distributions in Figures 2 and 3, which for most states was the 1987-88 school year. 
Each point in Figures 2 and 3 represents the percentage of students in a state who 
scored above the national median in a particular grade. 

As can be seen in Figure 2, there is considerable variability from state to 
state. The tendency for the percentages to be greater than 50 is quite evident for 
the elementary grades. However, there are some cases where the percentage is 
substantially below 50. It should be noted that the point in Figure 2 that is most out 
of line with the Cannell (1987) results is the Grade 4 reading point that corresponds 
to a state where only 33% of the students were reported to have scored above the 
national median. This state introduced a statewide test in 1987-88 and hence was 
not included in the results reported by Cannell. 

The results shown in Figure 3 for mathematics show even greater state-to- 
state variability than was seen for reading. Consistent with the global results in 
Figure 1, the tendency for the percentages to be above 50 is more evident in 
mathematics than in reading. Some of the percentages in Figure 3 are 
extraordinarily high. Note, for example, Grade 2, where one state reported that 
86% of the students scored above the national median. The only two examples of a 
state where the percentage is below 50 for Grades 1 through 6 (the 41% at Grade 4 
and the 49% at Grade 6) are both for the state that introduced statewide testing in 
1987-88 and therefore was not included in Cannell's state-level data collection. 

Median percentile ranks or percentile rank of state means. Since the 
percentage of students scoring above the national median could not be 
estimated for all states, the median percentile ranks or percentile ranks of state 
means were also analyzed. Figures 4, 5, and 6, which parallel Figures 1, 2, and 3, 
respectively, display the results of the latter analyses. In general, the results 
using these percentile rank statistics are quite similar to the results using the 
percentage of students scoring above the national median. This is so despite the 
differences in the properties of the two statistics and the fact that the two sets 
of analyses are based on different, albeit overlapping, subsets of states. 
Figure 2 

Percentage of Students Reported by States to be Scoring above the 
National Median in Reading (Each Point Represents a State) 

Figure 3 

Percentage of Students Reported by States to be Scoring above the 
National Median in Mathematics (Each Point Represents a State) 

Figure 4 

Weighted Mean of State Percentile Ranks 

Figure 5 

State Median Percentile Rank or Percentile Rank of 
State Mean Test Score in Reading (Each Point Represents a State) 

Figure 6 

State Median Percentile Rank or Percentile Rank of 
State Mean Test Score in Mathematics (Each Point Represents a State) 

The conclusions (a) that most states are reporting results above the national 
average, (b) that the discrepancy is greater in mathematics than in reading, and (c) 
that the discrepancy is generally greater in the elementary grades than in the 
secondary grades do not depend on the use of a particular metric (e.g., the 
percentage of students above the national median). The same conclusions are 
supported by the use of the median percentile rank for each state or the percentile 
rank of the state mean. 

Normative Comparisons Based on District Results 

Data were obtained from 153 districts, or 87%, of the target of 175 districts. 
Appendix F provides a listing of the region, size, and SES of each of the 153 districts 
that returned questionnaires, provided reports on their testing programs, or 
completed telephone interviews. Districtwide norm-referenced test results were 
available for 148 of the 153 districts. For the remaining 5 districts, districtwide 
normative comparisons could not be obtained for the reasons indicated in 
Appendix F (e.g., only criterion-referenced results were available). 

Also shown in Appendix F are the grades where norm-referenced test results 
were reported for each district. The grades where the largest number of districts 
reported norm-referenced test results are Grades 3, 4, 5, 6, and 8, in which test 
results were obtained for between 118 and 123 districts. As was shown in Table 2, 
those grades, with the exception of Grade 5, were also popular choices for statewide 
norm-referenced testing. 

As was done for states, estimates of the percentage of students in a district 
who scored above the national median were obtained for each grade tested in 
reading and in mathematics whenever possible. Where these estimates could not be 
obtained, the district median percentile rank or the percentile rank corresponding 
to the district mean was used. 

Estimates, based on the district data, of the percentage of students scoring 
above the national median in reading and mathematics for Grades 1 through 12 are 
plotted in Figure 7. The percentages plotted in Figure 7 are weighted by district 
size, region, and SES and thus are estimates of the percentage of students 
nationwide at a given grade that scored above the national median in reading or in 
mathematics. The number of districts on which these estimates are based varies by 
grade. The number of districts reporting data that could be used for the estimates in 
Figure 7 was 57, 77, 89, 87, 88, 85, 70, 84, 61, 52, 49, and 21 at Grades 1 through 12, 
respectively. 

As can be seen, the estimated percentage of students scoring above the 
national median is consistently above 50%. For Grades 1 through 6, at least 57% of 
the students are estimated to have scores above the national median in reading. For 
mathematics, at least 62% of students are estimated to be above the national median 
in Grades 1 through 6. In Grades 9 through 12 the estimates of 51% or 52% for reading are 
closer to 50%; however, with the exception of Grade 12 with an estimate of 54%, 
the percentage of students estimated to have scores above the national median in 
mathematics is 56% or higher in every grade. Although 56% is obviously greater 
than 50%, it is still the case that nearly half the students (44%) received score 
reports below the national median when 56% scored above the median. 

Figure 8 presents results that are parallel to those in Figure 7, that is, based 
on the data from districts where estimates of median percentile ranks or the 
percentile ranks of the district means were obtained. The weighted means of these 
percentile rank statistics are based on substantially fewer districts at each grade (the 
number of districts equaled 17, 27, 34, 29, 31, 27, 26, 29, 15, 16, 15, and 4 at Grades 1 
through 12, respectively). Nonetheless, the results in Figure 8 lead to conclusions 
Figure 7 

Estimated Percentage of Students Scoring Above National Median 
Based on District Results Weighted by Region, District Size and SES 
that are essentially the same as those based on the estimated percentage of students 
above the national median. With the exception of Grade 12, where the number of 
districts reporting data in this form was extremely small, all of the weighted means 
are greater than 50. The results for the elementary grades are higher than those for 
the upper grades and the results for mathematics are higher than those for reading. 

In addition to providing overall estimates of student performance levels, the 
district results provide a basis for investigating between-district variability and 
characteristics of districts associated with level of performance. Estimates of the 
percentage of students who scored above the national median in reading and 
mathematics were obtained for a majority of the districts that returned test results. 
Distributions of these percentages for districts were inspected at each grade level in 
both content areas. Since the complete distributions for all grades are rather 
voluminous, distributions for only one grade are presented and discussed in detail. 
Summaries of the distributions for other grades are provided and complete 
distributions for Grades 1 through 12 are included in Appendix G. Grade 3 was 
chosen for illustrative purposes since it was the earliest of the grades that were most 
frequently tested and reported by districts in the sample. 

A total of 123 districts reported norm-referenced test results for Grade 3. 
Eighty-nine of those districts provided data that could be used to estimate the 
percentage of students scoring above the national median in reading and 
mathematics. The remaining districts reported data that could be used to obtain the 
median percentile rank or the percentile rank of the district mean, but did not 
provide a basis for obtaining the percentage of students scoring above the national 
median. 

Distributions of district percentages of students scoring above the national 
median are illustrated by the stem-and-leaf plots in Figure 9. The "stem" corresponds 
to the tens digit of the percentage of students in a particular district that scored 
above the national median. The "leaf" reports the units digit for a district's 
percentage. The results for each district are depicted by a leaf (i.e., a single digit 
under the leaf column) that is associated with a particular stem, which gives the tens 
digit for each leaf in that row. For example, one district reported that 93% of its 
students scored above the national median in reading and one district reported that 
94% of its students scored above the median. Those two districts are depicted in the 
upper-left-hand corner of Figure 9 by the 34 under the leaf column next to a stem 
of 9. The lowest percentage above the median for reading that was reported by a 
district was 15%. The results for that district are indicated by the leaf of 5 next to a 
stem of 1 toward the bottom of the stem-and-leaf diagram for reading. 
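
The construction of such a display can be sketched as follows. This is a minimal illustration, not the procedure used to draw Figure 9; the function name is hypothetical, and only the values 93, 94, and 15 in the sample come from the text, with the remaining values made up.

```python
from collections import defaultdict


def stem_and_leaf(percentages):
    """Return stem-and-leaf rows, highest stem first, as 'stem | leaves' strings."""
    leaves_by_stem = defaultdict(list)
    for value in percentages:
        # The tens digit is the stem; the units digit is the leaf.
        stem, leaf = divmod(int(value), 10)
        leaves_by_stem[stem].append(leaf)
    rows = []
    for stem in range(max(leaves_by_stem), min(leaves_by_stem) - 1, -1):
        leaves = "".join(str(leaf) for leaf in sorted(leaves_by_stem.get(stem, [])))
        rows.append(f"{stem} | {leaves}")
    return rows


# 93, 94, and 15 are reading values mentioned in the text; the rest are made up.
sample = [93, 94, 78, 61, 57, 55, 52, 48, 33, 15]
for row in stem_and_leaf(sample):
    print(row)
```

The first row printed is "9 | 34", matching the description of the two districts at 93% and 94%, and the last row is "1 | 5" for the district at 15%.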

As can be seen in Figure 9, a majority of the districts reported that 50% or 
more of their students scored above the national median in reading (61 of 89 
districts) and mathematics (69 of 89 districts). Only 16 of the 89 districts reported 
that less than 40% of their students scored above the national median in reading, 
but there were 12 districts that reported that three-fourths or more of their students 
scored above the national median. In mathematics the results show even larger 
numbers of districts that reported a substantial majority of their students above the 
median. 

In order to summarize the distributions of district percentages of students 
reported to have scored above the national median, the 10th, 25th, 50th, 75th, and 
90th percentiles of the distributions were obtained. For Grade 3, those percentiles 
are reported at the bottom of the two columns of Figure 9. (Parallel results for the 
other grades are presented in Appendix G.) These figures indicate, for example, 
that 10% of the districts reported that 32% or fewer of their third-grade students 
scored above the national median in reading. On the other hand, the 90th 
Figure 8 

Means of District Percentile Ranks Weighted by Region, District Size, and SES 

Figure 9 

Stem-and-Leaf Distribution of the District Percentages of 
Students Scoring above the National Median at Grade 3 

percentile of 78 indicates that 10% of the districts reported that over three-fourths 
of their third-grade students scored above the national median in reading. 

The five selected percentiles (10th, 25th, 50th, 75th, and 90th) of the 
district distributions of the percentage of students scoring above the national 
median were computed for all twelve grades. Those percentiles are shown in the 
box-and-whisker plots displayed in Figures 10 and 11 for reading and mathematics, 
respectively. Looking, for example, at the Grade 1 box-and-whisker plot for reading 
in Figure 10, it can be seen that the 10th percentile for the 57 districts reporting 
data at Grade 1 was 35, indicating that 1 district in 10 reported that 35% or less of its 
students scored above the national median. From the remaining percentiles for the 
Grade 1 reading results it can be seen that one district in four reported 45% or less 
of its students scored above the national median, half the districts reported 55% or 
less, three districts in four reported 66% or less, and nine districts in ten reported 
81% or less. 
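
A summary of this kind could be computed along the following lines. The report does not say which percentile convention was used, so the linear interpolation below is an assumption (one common convention among several), and the district values are hypothetical.

```python
def percentile(sorted_values, p):
    """Percentile by linear interpolation between neighboring ranks (assumed convention)."""
    if not sorted_values:
        raise ValueError("empty data")
    position = (len(sorted_values) - 1) * p / 100
    lower = int(position)
    upper = min(lower + 1, len(sorted_values) - 1)
    fraction = position - lower
    return sorted_values[lower] + fraction * (sorted_values[upper] - sorted_values[lower])


def five_percentile_summary(values):
    """Return the 10th, 25th, 50th, 75th, and 90th percentiles of district percentages."""
    ordered = sorted(values)
    return {p: percentile(ordered, p) for p in (10, 25, 50, 75, 90)}


# Hypothetical district percentages of students above the national median.
district_percentages = [30, 35, 40, 45, 50, 55, 60, 65, 70, 75, 80]
summary = five_percentile_summary(district_percentages)
```

For these eleven made-up districts the summary is 35, 42.5, 55, 67.5, and 75, which are the five values a box-and-whisker plot of the kind described would display.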

From an inspection of Figure 10, it can be seen that districts at the 50th 
percentile reported that more than half (54% to 58%) of their students scored 
above the national median in reading in Grades 1 through 8. Only at Grade 10 did a 
district at the 50th percentile report that slightly less than half (48%) of its students 
scored above the national median in reading. For the elementary grades, the 
tendency to have more than half of the students in a district scoring above the 
national median is much stronger in mathematics (Figure 11) than in reading (Figure 
10). In Grades 1 through 6, for example, the 25th percentile for mathematics is equal to or 
above 50. In other words, three-quarters of the districts had more than half their students 
scoring above the median. Moreover, half the districts had 59% or more of their students 
above the national median in mathematics for Grades 1 through 8. 

The percentage of districts that had more than half of their students scoring 
above the national median should not be interpreted as a direct indication of the 
percentage of students across districts who were scoring above the median. It would 
be possible, for example, for a substantial majority of districts to have more than half 
their students above the median while less than half of all students across districts 
were above the median. Nonetheless, it is clear that it is more common for a district 
to report test results that are "above average" than ones that are "below average." 
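
The distinction between counting districts and counting students can be made concrete with a small sketch. The four districts below are hypothetical and were chosen only to show that the two summaries can point in opposite directions when district sizes differ greatly.

```python
def share_of_districts_above(districts):
    """Fraction of districts in which more than half of students are above the median."""
    above = sum(1 for _, pct in districts if pct > 50)
    return above / len(districts)


def pooled_percent_above(districts):
    """Percentage of all students, across districts, who are above the median."""
    total = sum(n for n, _ in districts)
    return sum(n * pct for n, pct in districts) / total


# Hypothetical districts: (students tested, percent above national median).
districts = [(1_000, 60.0), (1_000, 58.0), (1_000, 55.0), (20_000, 45.0)]
district_share = share_of_districts_above(districts)
student_percent = pooled_percent_above(districts)
```

Here three of the four districts (75%) report more than half of their students above the median, yet only about 46.7% of all students are above it, because the one large district dominates the pooled count.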

The district results provide support for the general finding that it is more 
common to have students scoring above the national median than it is to have them 
scoring below the median. However, there are more exceptions to this rule, 
particularly in reading, than were suggested by the Cannell study, which reported 
that 169 of 188 districts were "above average." Five districts refused to provide the 
information and only 14 districts were classified as "below average" in the Cannell 
study. 

Cannell's results were based on a telephone survey of the largest districts in 
the sixteen states where statewide results were unavailable. Districts were "asked if 
their elementary (1-6) total battery scores were above, at, or below the national 
average" (Cannell, 1987, p. 22). A district was called above average if four of six 
grades were above the national norm, and scores on reading, language, and math 
were used in cases where total battery scores were unavailable. 

That the frequency of districts with scores below the median suggested by 
Figures 10 and 11 is greater than that suggested by the Cannell results is attributable 
largely to the difference in definitions. For example, one district that was classified 
as above average based on the Cannell study reported that for Grades 2 through 6 
the percentages of students scoring above the national median in reading during the 
1986-87 school year were 56, 47, 35, 44, and 48, respectively. While this district 
would appear to be "below average" based on these reading test results, it would 
Figure 11 

Box-and-Whisker Plots Showing the Percentage of Students Reported to be 
Above the National Median in Mathematics by Grade for Districts at the 
10th, 25th, 50th, 75th, and 90th Percentiles of the District Distributions 

appear to be clearly "above average" based on the corresponding percentages for 
mathematics (64, 64, 54, 60, and 68, for Grades 2 through 6, respectively). In 
general, districts reported a larger percentage of students above the national median 
when using total battery or mathematics scores than when using reading scores. 
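
How much the classification depends on the score used can be seen by counting, for the district just described, the grades in which more than half of the students scored above the national median. The counting rule below is a simplified stand-in chosen for illustration; it is not Cannell's rule, which was applied to total battery scores for six grades. The function name is hypothetical, while the percentages are those quoted in the text.

```python
def grades_above(percentages, threshold=50):
    """Count the grades in which the percentage above the national median exceeds threshold."""
    return sum(1 for pct in percentages if pct > threshold)


# Percentages reported in the text for one district, Grades 2 through 6, 1986-87.
reading = [56, 47, 35, 44, 48]
mathematics = [64, 64, 54, 60, 68]

reading_count = grades_above(reading)
mathematics_count = grades_above(mathematics)
```

On this count the district exceeds 50% in only one of five grades in reading but in all five grades in mathematics.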

Summary of State and District Results 

Clearly it was the exception rather than the rule for a state to report that its 
students, particularly its elementary school students, were performing below the 
national average. Although it was somewhat more common for a district than a state 
to report that less than half of its students were scoring above the national median, 
a substantial majority of districts reported that their students were performing above 
average (i.e., more than 50% of the students were reported to be above the national 
median). The tendency for students to score above the national median was 
especially strong in mathematics for Grades 1 through 8. Nonetheless, it should be 
noted that some districts reported that substantially less than 50% of their students 
scored above the national median. At Grade 3, for example, 1 district in 10 reported 
that a third or less of its students scored above the national median in reading. 



Achievement Trends and Dated Norms 

Although the state and district results are generally consistent with the 
Cannell and earlier SREB findings which reported that achievement test results are 
more often above than below the national norm, they provide no real indication of 
the reasons that led to this result. As was discussed earlier, a wide variety of factors 
have been suggested as possible explanations of the apparently high test results that 
are being reported by states and districts. General improvement in student 
achievement, at least at the elementary grades, is clearly one possibility. When 
there are upward trends in achievement, old norms are easier (i.e., they provide a 
lower standard of comparison) than new norms, and thus a state or district whose 
students score at the current national average would score above the average 
defined by dated norms. 

Using the aggregate results for districts, the district percentages of students 
scoring above the median in reading and in mathematics were related to the age of 
the norms used by districts at each grade (i.e., the number of years between the date 
of the test administration by a district and the date of the test norming by the 
publisher). Table 5 lists the number of districts that provided information on the 
year that the norms in use were obtained and the percentage of students scoring 
above the median for Grades 1 through 12. Also shown in Table 5 are the mean age 
of the norms used by districts, the mean change in the percentage of students 
scoring above the median for each additional year since the norms were obtained, 
and the estimated mean change in the percentage that resulted from the use of old 
norms rather than current norms. 

As can be seen in Table 5, the average district that returned data was using 
norms that were four or five years old. Although most districts were using the most 
recent norms available from the publisher for the test being used, there was still an 
average of four or five years between the date of test administration by the district 
and the date of norming because publishers typically have collected norms only 
about every seven years. With a single exception, the percentage of students 
scoring above the median increased in both reading and mathematics with each 
additional year since the norms were obtained. The exception was for reading at 
Grade 10. By using norms that were four or five years old rather than current 
norms, assuming the latter were available, the percentage of students scoring above 
the median was estimated to be higher in all but Grade 10 in reading and in every 
grade for mathematics. For Grades 1 through 8 the expected increase ranges from 
Table 5 

Changes in District Percentages of Students 
Above the National Median with Increasing Age of Norms 

                                  Mean Change in       Estimated Mean
         Number     Mean Age     Percentage Above     Change (Old Minus
           of       of Norms     Median per Year       Current Norms)
Grade   Districts   (Years)*     Reading     Math     Reading      Math

    1       46        4.7          1.3       1.7          6          8
    2       63        4.8          1.0       1.9          5          9
    3       73        5.1          1.2       1.7          6          9
    4       70        4.3          1.3       1.4          6          6
    5       73        5.2          1.4       1.9          7         10
    6       69        4.5          1.0       2.3          5         10
    7       61        4.8          0.5       2.2          2         11
    8       70        5.1          1.7       2.2          9         11
    9       49        4.7          0.5       2.3          2         11
   10       42        4.7         -0.3       1.1         -1          5
   11       42        5.0          1.1       2.3          6         12
   12       14        5.4          0.2       1.2      [missing]      6

* Mean age of norms is the average number of years between the date of test 
administration and the date that the norms used to report district results 
were collected by the publisher. 



2% to 9% in reading and from 6% to 11% in mathematics. Taking differences of the 
latter magnitude into account would largely eliminate the tendency for these 
districts to report results that are above the national median. 
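
The estimated changes in the last columns of Table 5 appear to equal the mean age of the norms multiplied by the mean change per year, although the report does not state the formula. The sketch below checks that reading of the table for mathematics; treating the estimate as a simple product is an assumption, and the variable and function names are hypothetical.

```python
# Values transcribed from Table 5 (Grades 1 through 12).
mean_age = [4.7, 4.8, 5.1, 4.3, 5.2, 4.5, 4.8, 5.1, 4.7, 4.7, 5.0, 5.4]
math_per_year = [1.7, 1.9, 1.7, 1.4, 1.9, 2.3, 2.2, 2.2, 2.3, 1.1, 2.3, 1.2]
math_reported = [8, 9, 9, 6, 10, 10, 11, 11, 11, 5, 12, 6]


def estimated_change(age_years, change_per_year):
    """Assumed relationship: change per year multiplied by the age of the norms."""
    return age_years * change_per_year


math_estimates = [estimated_change(a, c) for a, c in zip(mean_age, math_per_year)]
# Largest difference between the product and the whole-number value in the table.
largest_gap = max(abs(e - r) for e, r in zip(math_estimates, math_reported))
```

Under this assumption every product lies within about half a percentage point of the tabled value, which is what rounding to whole numbers would produce. At Grade 1, for example, 4.7 years at 1.7 points per year gives about 8 points.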

Trends over Several Years for Selected States 

The district results in Table 5 show that there is a relationship between the 
age of norms used and the level of achievement test scores for the districts in this 
sample. These results are cross-sectional, and there may be a variety of other district 
characteristics associated with the age of norms for the test used as well as the level 
of student achievement. Therefore, these results do not provide a sufficient basis 
for concluding either that older norms are easier than newer norms or that 
achievement has been going up. 

Figures 1 and 4, which were considered earlier, present achievement test 
results for three years. Neither of these figures provides a very clear indication that 
achievement scores went up or down during the three years for which data were 
collected. There is some suggestion from both of these figures that scores went up 
in Grades 1, 2, and 3. However, the direction of change is not only unclear at most 
other grades, but would be difficult to interpret in any event because the subset of 
states for which data were obtained changed somewhat from year to year. 
Furthermore, three years is too short a time interval to assess long-term trends. 

Though not a specific part of the data collection design, results included in 
the state assessment reports for some of the states made it possible to look at trends 
for longer time intervals. Achievement trends for four states are summarized in 
Figure 12. 

The upper-left-hand quadrant of Figure 12 shows a plot of the percentage of 
students in one state (State A) scoring above the national median in reading and 
mathematics at Grade 4 for each of the past six school years. During this interval a 
single test form of a single edition of a test was administered each year and results 
were based on comparisons to the 1980-81 national norms provided by the test 
publisher. As can be seen, the first year the test was administered, 1982-83, the 
percentage of students scoring above the national median was well below 50 for 
both reading (41%) and mathematics (44%). During each of the following five years 
these percentages increased, most notably in mathematics. In 1987-88, 57% of the 
students scored above the national median in reading and 68% scored above the 
national median in mathematics. 
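
The size of the State A trend can be expressed as an average gain per year using only the two endpoints given in the text. The helper below is a hypothetical illustration; it says nothing about the year-to-year path between the endpoints, which is shown only in Figure 12.

```python
def average_annual_gain(first_value, last_value, first_year, last_year):
    """Average change per year between two school years, identified by starting year."""
    return (last_value - first_value) / (last_year - first_year)


# State A, Grade 4: percentages for 1982-83 and 1987-88 as quoted in the text.
reading_gain = average_annual_gain(41, 57, 1982, 1987)      # 3.2 points per year
mathematics_gain = average_annual_gain(44, 68, 1982, 1987)  # 4.8 points per year
```

Over the five intervals between 1982-83 and 1987-88, the percentage above the national median rose by an average of 3.2 points per year in reading and 4.8 points per year in mathematics.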

Similar results using the alternative statistic of the percentile rank in the 
individual pupil norms corresponding to the statewide mean test score are shown for 
another state (State B) in the upper-right-hand quadrant of Figure 12. As in the 
previous example, the results are shown for a six-year period during which a single 
form of a single edition of a test was administered each year. Comparisons were to 
norms obtained in 1978 in this case. Although the trend for State B was less steep 
than the one for State A and was based on a different metric, there was a clear 
upward trend during the six years in both reading and mathematics. 

The third example, State C, shown in the lower-left-hand quadrant of Figure 
12, uses an entirely different metric than has been considered so far. The plots for 
State C show the percentage of students passing statewide minimum-competency 
tests in reading and mathematics for each of seven years. In mathematics the 
percentage passing was 95% in the first year and gradually increased to 98% over time. 
For reading, where there was more room for movement, the increases between the 
first and most recent years of test administration were more substantial. 

The final plot shown in the lower-right-hand quadrant of Figure 12 displays 
the percentile ranks of the state means in reading and mathematics based on 
individual pupil norms for Grade 3 in State D. The State D results not only span the 
longest time interval, twelve school years, but include a change in test editions 
within the period of time that was covered. A single form of a single edition of a 
test was used for the eight years starting in 1976-77 and running through 1983-84. 
The pattern for those first eight years was reasonably similar to the ones shown for 
the other three states in Figure 12. There was a consistent upward trend during 
those years. 

The feature of the plot for State D that most clearly sets it apart from the
plots for the other three states in Figure 12 is the sharp decline shown in percentile
rank between the 1983-84 and 1984-85 school years, followed by increases over the
next three years to bring 1987-88 results back to approximately where they were in
1983-84. As was previously indicated, during the 1984-85 school year the new
edition of the test was introduced and the same form of that edition was
administered in each of the last four years covered in the plot of results for State D.
Thus the sharp decline corresponds to the introduction of the new test edition.

The sharp decline in performance relative to national norms that State D
experienced when the new edition of the test was introduced is not unique.
Figures 13 and 14, for example, show the results for two large school districts that
introduced new editions during the 1987-88 school year. As can be seen, both
districts experienced large declines in the percentage of students scoring above the
national median between 1986-87 and 1987-88.

Figure 12

Trends in Reading and Mathematics Achievement for Four States

There are several possible interpretations of the trend results shown in
Figures 12, 13, and 14. The most straightforward interpretation of the trends in
Figure 12 is that achievement in reading and mathematics for the grades in question
improved rather steadily in all four states. The dip when a new edition was
introduced in State D could simply reflect general increases in student performance
across the nation, which made the more recent norms associated with the newer
edition more stringent than the norms associated with the older edition of the test.
This same interpretation could also explain the dips in performance levels associated
with a new test edition for the two districts shown in Figures 13 and 14.

An alternative interpretation of these results, however, is that increases in
test scores simply reflect increasing familiarity with a given test form and more
focused instruction on the content of that specific form. When the same form of
a test is administered for several years, teachers are apt to become increasingly
familiar with the specifics of the test content and to alter instructional emphases to
better match the content of the test. As indicated by Mehrens and Kaminski (1988) and by
Shepard (1989), test familiarity might influence instruction in a wide variety of ways,
ranging from practices that would generally be considered sound uses of test results
(e.g., identifying and working on objectives where students show weaknesses) to
those that most educators consider unethical (e.g., teaching the specific items on a
test just prior to test administration).

It is not possible to distinguish whether the trends in Figures 12, 13, and 14
were due to improvements in achievement, to increased familiarity with the tests,
or to some alternative explanation, solely from the results presented in those
figures. However, other data can be brought to bear on the issue. In particular, the
questionnaire and interview results which are discussed in other reports based on
this project (e.g., Shepard, 1989) speak to some of these issues. Only the question
of whether norms are changing in difficulty with time as a result of increases in
student achievement nationally will be considered here.

Achievement Trends and Changes in the Difficulty of Norms

National changes in achievement levels obviously lead to differences in the
meaning of norms. During a period of declining performance such as the nation
experienced in the 1960s and the first part of the 1970s (Harnischfeger & Wiley,
1975; Koretz, 1986; 1987), newer norms provide a less stringent standard of
comparison than older norms. Koretz (1987), for example, estimated that during the
period of the much publicized test score decline (roughly the early or mid 1960s to
the mid 1970s) "the average decline in grades six and above was large enough that
the typical (median) student at the end of the decline exhibited the same level of
achievement as was shown before the decline by students at the 38th percentile"
(p. 2). Thus a state or district using old norms in the mid 1970s could have appeared
to be well below the national average when in fact its students were scoring at
the then current national average. On the other hand, when performance on
achievement tests is increasing, newer norms become harder and the use of old
norms can make a state or district that would have only average or below average
scores in terms of current national norms appear to be above average. Clearly,
national trends in achievement test scores have importance for understanding
normative comparisons.
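
The size of the shift Koretz describes can be put on a rough standard-deviation scale. The sketch below is only an illustration: it assumes scores are normally distributed with the same spread before and after the decline, an assumption that is not made in the text, and the function names are hypothetical.

```python
from statistics import NormalDist

STANDARD_NORMAL = NormalDist()  # assumption: normally distributed scores


def decline_in_sd_units(old_percentile_of_new_median: float) -> float:
    """Size of a decline, in SD units, implied by where the new median
    falls in the old norms (e.g., 38 for the 38th percentile)."""
    return -STANDARD_NORMAL.inv_cdf(old_percentile_of_new_median / 100)


def rank_on_old_norms(decline_sd: float) -> float:
    """Old-norm percentile rank of a student at the current national median."""
    return 100 * STANDARD_NORMAL.cdf(-decline_sd)


decline = decline_in_sd_units(38)
print(round(decline, 2))                  # 0.31
print(round(rank_on_old_norms(decline)))  # 38
```

Under these assumptions a drop from the 50th to the 38th percentile corresponds to a decline of roughly three-tenths of a standard deviation.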

Although increases in test performance have not received as much attention
as the decline of the 1960s and 1970s, several sources of evidence suggest that
achievement test scores have been going up. National Assessment of Educational
Progress (NAEP) reports (e.g., Dossey, Mullis, Lindquist, & Chambers, 1988; NAEP,
1985) indicated that there were some increases in reading and mathematics between
the early or mid 1970s and the mid 1980s. Based on his review of NAEP and data from
several other tests, Koretz (1987) concluded that the decline in test scores ended
with cohorts of students that entered school in the late 1960s and that subsequent
cohorts of students "produced a sharp rise in scores on most, but not all, tests. In
the majority of instances in which scores increased, the rise has been steady—with
each cohort tending to outscore the preceding one—and often roughly as fast as the
decline" (p. 2).

Figure 13

Percentage of Students Above National Median for District A
Before and After a Change of Test Editions (New Edition in 1987-88)

Figure 14

Percentage of Students Above National Median for District B
Before and After a Change of Test Editions (New Edition in 1987-88)

Norming studies conducted periodically for standardized tests also provide 
evidence regarding trends in national achievement. When a new edition of a 
standardized test is introduced, it is customary not only to collect new normative 
data for the new edition but also to equate the old and new editions of the test. 
The equatings make it possible to estimate the extent to which achievement has
increased or decreased over the years between the norming of the two editions. In
some cases, new norms are collected for a previously normed edition of a test, which 
again provides a means of comparing national performance on the test at two points 
in time. 

Several test publishers reported increases in achievement based on the
results of their norming studies. CTB/McGraw-Hill (1987), for example, noted when
the norms for Form E of the California Achievement Tests (CAT) were reported and
compared to the norms for the CAT Form C to which Form E was equated that "the
CAT E norms are more difficult than the CAT C norms. This seems to indicate that
students in 1984-85 were achieving at a higher level than in 1977, when CAT C was
normed" (p. 3-4). Increases in performance were also reported when Form G of the
Iowa Tests of Basic Skills (ITBS) was published. "Between 1977-78 and 1984-85, the
improvement in ITBS test performance more than made up for previous losses in
most test areas. Composite achievement in 1984-85 was at an all time high in nearly
all test areas" (Hieronymus & Hoover, 1986, p. 148). Increases in performance have
also been reported for the Stanford Achievement Test (SAT7) (Wiser & Lenke, 1987)
and the Comprehensive Tests of Basic Skills (CTBS) (Rothman, 1988), and increases
can be inferred from comparisons of the norms for the Metropolitan Achievement
Tests (MAT6) (Psychological Corporation, 1988) and norms for equivalent scores on
the previous edition of the MAT (Prescott, Balow, Hogan, & Farr, 1978; 1986).

Table 6 provides a summary of the changes in the percentile rank of
achievement test scores that were at the national median at one of the two times
that norms were obtained for the six most used standardized achievement tests. The
numbers are estimates of the changes in national percentile rank in reading and
mathematics between the two norming years indicated at the head of each column
of the table. Also shown for comparative purposes are estimated changes in national
percentile ranks based on NAEP.

As is indicated in the footnotes to Table 6, the numbers in each column of 
Table 6 are derived from different sources and involve different types of 
comparisons. In the case of the CTBS, the comparison is between 1981 norms and 
estimates of 1987 norms for the same test form based upon a weighting of user data. 
The Stanford results are based on 1981-82 and 1986 norming studies for the same 
test form. The other published test comparisons involve norming studies for
successive editions of the test battery. However, the numbers in Table 6 all have a
similar interpretation. A positive number indicates that performance was higher
when measured at the more recent of the two norming years indicated at the top of
each column. For example, the number 14 shown for reading achievement on the
California Achievement Tests (CAT) in Grade 2 indicates that an equated Form C or
Form E score that would have placed a student at the national 50th percentile using
the 1977 Form C norms would lead to a national percentile rank of only 36 using the
1984-85 Form E norms. The 14 shown in Table 6 is the difference between the
percentile ranks of 50 in 1977 and 36 in 1984-85.
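
The arithmetic behind an entry of this kind can be sketched in a few lines. The function name is hypothetical and is introduced only for illustration; the values 50 and 36 are the CAT Grade 2 reading figures quoted above.

```python
def percentile_rank_change(rank_on_old_norms: float, rank_on_new_norms: float) -> float:
    """Change in national percentile rank for a single equated score.

    A positive result means the newer norms are more stringent, that is,
    national performance was higher at the more recent norming.
    """
    return rank_on_old_norms - rank_on_new_norms


# CAT, Grade 2 reading: 50th percentile on 1977 norms, 36th on 1984-85 norms.
print(percentile_rank_change(50, 36))  # 14
```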




Table 6 

Estimated Changes in National Percentile Rank of
Achievement Scores at the National Median at One Point in Time

Footnotes for Table 6



1 Differences in California Achievement Tests (CAT), Form E (1984-85 norms) 
percentile ranks and corresponding CAT, Form C (1977 norms) percentile 
ranks of 50 (CTB/McGraw-Hill, 1987, Table 38, p. 3-35). 

2 Differences in Comprehensive Tests of Basic Skills (CTBS), Form U percentile
ranks in 1981 and those required to have a percentile rank of 50 on the CTBS
in 1987 (based on November, 1988, CTB/McGraw-Hill press release,
"CTB/McGraw-Hill Studies Show Students Achieving at Higher Levels in Basic
Skills"; see also Rothman, 1988, p. 20). The 1987 norms are estimated from
weighted user data.

3 Differences in Iowa Tests of Basic Skills (ITBS), Form G (1984-85 norms)
percentile ranks and corresponding ITBS, Form 7 (1977-78 norms) percentile 
ranks of 50 (Hieronymus & Hoover, 1986, Table 6.31, p. 153). 

4 Differences in Metropolitan Achievement Tests (MAT6), Survey Forms L and
M (1984-85 norms) and corresponding MAT, Forms J and K (1977-78 norms)
percentile ranks of 50 (Psychological Corporation, 1988; Prescott, Balow,
Hogan, & Farr, 1978; 1986).

5 Differences in SRA Achievement Series, Forms 1 and 2 (1983-84 norms) 
percentile ranks and corresponding SRA Achievement Series Forms 1 and 2 
(1978 norms) percentile ranks of 50 (Science Research Associates, 1979; 1986). 

6 Differences in Stanford 7 Plus (1986 norms) percentile ranks and 
corresponding Stanford Early School Achievement Test, 2nd edition; Stanford 
Achievement Test, 7th edition, and Stanford Test of Academic Skills (TASK), 
2nd edition (1981-82 norms) percentile ranks of 50 (Gardner, Madden, 
Rudner, Karlsen, Merwin, Callis, & Collins, 1983; 1987). 

7 Differences for the National Assessment of Educational Progress (NAEP) are
based on age (9, 13, and 17) rather than grade (3, 7, and 11) cohorts. For
reading, the differences are between the 1983-84 assessment percentile ranks
and the corresponding 1974-75 assessment percentile rank of 50 (NAEP,
1985). For math, the differences are between the 1985-86 assessment
percentile ranks and the corresponding 1977-78 percentile rank of 50 (NAEP,
1988; frequency distributions provided by Beaton).



With the exception of the SRA Achievement Series, the differences for
Grades 1 through 8 are all positive, indicating that more recent norms are more stringent
than older norms for five of the six tests. For Grades 10 through 12 the differences
are generally smaller than those shown for the earlier grades and two of the four
tests with results for the high school grades have some differences that are negative,
indicating a decline in performance and therefore easier recent norms in those
instances.

The changes in percentile ranks shown in Table 6 are based on various time
intervals between norming studies. More direct comparisons can be made by dividing
the changes in percentile ranks in Table 6 by the number of years between the
norming studies to obtain estimates of yearly changes in percentile ranks. Such
yearly changes in percentile ranks for Grades 1 through 8 are presented graphically in
Figures 15 and 16 for reading and mathematics, respectively.
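
The conversion to a yearly rate is a simple division, sketched below. The function name is hypothetical, and so is the way the interval is counted: a school-year norming such as 1984-85 is represented here by its spring calendar year, which may not match the convention used for the figures.

```python
def yearly_change(change_in_percentile_rank: float,
                  first_norming_year: float,
                  second_norming_year: float) -> float:
    """Average change in percentile rank per year between two norming studies."""
    years_between = second_norming_year - first_norming_year
    if years_between <= 0:
        raise ValueError("the second norming must come after the first")
    return change_in_percentile_rank / years_between


# Illustration: a 14-point change between 1977 norms and 1984-85 norms,
# treating 1984-85 as 1985 (an assumption made only for this example).
print(yearly_change(14, 1977, 1985))  # 1.75
```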

In general, the results in Figures 15 and 16 are fairly consistent with those
based on the analyses of the district data that were reported in Table 5. The
estimates of yearly changes derived from the district data are greater than those
shown in Figures 15 and 16 for some tests but smaller than those for other tests.
The Table 5 estimates of changes in norm-referenced performance that would be
expected as a result of a change in the date of the norms, however, are of the same
order of magnitude as those shown in Figures 15 and 16.

Figure 15

Estimated Yearly Changes in Reading Percentile Rank:
Publisher Results at the Median

Figure 16

Estimated Yearly Changes in Mathematics Percentile Rank:
Publisher Results at the Median

Although the NAEP trend results are based on age cohorts rather than grade
cohorts, the NAEP results represent the best available independent means of
estimating national changes in achievement. Changes in percentile ranks estimated
from NAEP results between the 1974-75 and 1983-84 assessments for reading and
between 1977-78 and 1985-86 for mathematics are plotted in Figures 17, 18, and 19
for 9-, 13-, and 17-year-olds, respectively. Also shown in these figures are the
changes for the six norm-referenced tests at the modal grades for 9-, 13-, and
17-year-olds, that is, Grades 3, 7, and 11.

As can be seen in these figures, the different data sources vary a good deal in
the magnitude of change in performance. The NAEP results suggest either some
increase in performance (ages 9 and 17 in reading and ages 9 and 13 in mathematics)
or no change during the interval in question. The increases indicated by NAEP are
smaller than those shown by some, but not all, of the standardized tests. Comparing
the publisher Grade 3 results with NAEP age 9 results (Figure 17), it can be seen that
four of the six standardized tests show larger gains in reading and five of the six
show larger gains in mathematics than would be estimated by NAEP. At age 13
(Figure 18) NAEP shows no change in reading and two of the standardized tests (SRA
and Stanford) indicate only small changes at Grade 7, but the remaining four tests
suggest more substantial increases in performance. In mathematics, two standardized
tests suggest smaller changes at Grade 7 than NAEP obtained for 13-year-olds, one
standardized test shows a change similar to the one obtained by NAEP, and the
remaining three standardized tests show larger gains in performance. At Grade 11 or
age 17 (Figure 19), relatively little change is indicated by any of the data sources for
reading and relatively small and inconsistent changes are indicated for mathematics.

Of course, the dates of the first and second normings are not the same for all
the tests and the tests differ in content coverage and in the specifics of the samples
on which the norms were based. Nonetheless, the different data sources give rather
different answers in some cases to the question of the degree to which test
performance has increased during the past decade. The discrepancy between
increases suggested by NAEP and most of the standardized tests raises questions
about the possibility that artifacts may inflate the norm-referenced test results.

One possible artifact is that the norms obtained for a standardized test may
be biased because of differential participation rates in norming studies by school
districts according to whether the districts were already using the standardized test
being normed (Baglin, 1981). If school districts that are already using a standardized
test are more likely to participate in the norming of a new edition of the test than
districts using another publisher's test, and if districts that are using a given test
generally have curricula that match more closely the objectives of both the new and
old editions of that test or emphasize those objectives because the test is used, then
the norms could be more difficult. In other words, such an influence would run
counter to the observed tendency for states and districts to report that more than
50% of their students score above the national median.

To investigate the latter possibility, Wiser and Lenke (1987) compared the
performance of user and non-user groups when the 1986 norms for the Stanford
were obtained. They found that "users performed as well or better than non-users
in all subject areas through Grade 6." For Grades 7 through 12 the results were more
mixed, with users performing better in some subject areas at some grades but
non-users performing better for other combinations.

Figure 17

Estimated Change at the Median in National Percentile Ranks of
Achievement Test Scores at Grade 3 (NAEP, Age 9)

Figure 18

Estimated Change at the Median in National Percentile Ranks of
Achievement Test Scores at Grade 7 (NAEP, Age 13)

Figure 19

Estimated Change at the Median in National Percentile Ranks of
Achievement Test Scores at Grade 11 (NAEP, Age 17)

Wiser and Lenke noted that the comparison of particular interest in their
results was between the 1986 non-users and the 1982 norming sample. Since the
Stanford 7 was a new edition at the time of the 1982 norming, the participants in
the norming sample had not previously used the edition and were comparable in
that sense to the 1986 non-user sample. The 1982 sample and the 1986 non-user
samples were also matched on school ability as measured by the Otis-Lennon School
Ability Test. Thus, a comparison of the 1982 and 1986 non-user results provides an
estimate of the change in achievement that is uncontaminated by the familiarity
that users have with the particular edition of the test.

We used the scaled score means and standard deviations reported by Wiser 
and Lenke (1987) to calculate two estimates of the changes in average test scores in 
terms of 1982 standard deviation units for total reading and total mathematics. The 
first estimate is simply the mean for the full 1986 norming sample (users and non- 
users) minus the 1982 mean, all divided by the 1982 standard deviation. The second 
estimate is the 1986 mean for non-users only minus the 1982 mean, all divided by
the 1982 standard deviation. The two sets of standardized differences are
summarized in Table 7.
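
Both estimates use the same formula, which can be sketched as follows. The function name and the scaled-score values are hypothetical and chosen only to illustrate the calculation; they are not the means or standard deviations reported by Wiser and Lenke.

```python
def standardized_change(mean_1986: float, mean_1982: float, sd_1982: float) -> float:
    """Change in mean scaled score expressed in 1982 standard deviation units."""
    return (mean_1986 - mean_1982) / sd_1982


# Hypothetical scaled scores, for illustration only.
mean_1982, sd_1982 = 600.0, 40.0
full_sample_1986 = 606.8   # users and non-users combined
non_users_1986 = 604.0     # non-users only

print(round(standardized_change(full_sample_1986, mean_1982, sd_1982), 2))  # 0.17
print(round(standardized_change(non_users_1986, mean_1982, sd_1982), 2))    # 0.1
```

Dividing by the 1982 standard deviation in both cases keeps the two estimates on a common scale, so that any gap between them reflects the difference between the full sample and the non-users rather than a change in score spread.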



Table 7

Estimated Standardized Average Changes in Achievement Test Scores on the
Stanford from 1982 to 1986 (Based on Wiser & Lenke, 1987)

                  Reading                    Mathematics
           Total       1986             Total       1986
Grade      Group^a     Non-users^b      Group       Non-users

  1         .17         .10              .34         ^c
  2         .13         .04              .18         .10
  3         .13         .12              .15         .12
  4         .03        -.01              .12         .12
  5         .03        -.02              .17         .16
  6         .03        -.02              .10         .06
  7         .03^d       .03^d            .08         .06
  8         .00^d      -.08^d            .10         .11
  9         .08^d       .03^d            .05         .07
 10         .05         .05              .04         .03
 11         .10         .11              .03         .05
 12         .13         .14              .05         .08

^a The mean for the full 1986 norming sample (users and non-users) minus the
1982 mean, all divided by the 1982 standard deviation.

^b The mean for the 1986 non-users only minus the 1982 mean, all divided by
the 1982 standard deviation.

^c Not available.

^d Reading Comprehension.

For Grades 1 and 2 the non-user group data result in estimates of the gain in
achievement in reading between 1982 and 1986 that are substantially smaller than
the estimates based on the total norming sample. The gain in reading achievement
appears to be about 40% smaller (i.e., 100 x (.17 - .10)/.17) at Grade 1 and about 70%
smaller at Grade 2 with non-user data than with the data from the total norming
sample. This difference is consistent with the premise that familiarity with a test
form leads to inflated estimates of achievement gains. However, large differences in
estimates based on non-user and total norming sample data such as those for reading
in Grades 1 and 2 are not found consistently.
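
The percentage reductions quoted above follow directly from the Table 7 entries for reading at Grades 1 and 2. A minimal sketch of the calculation, using a hypothetical function name:

```python
def percent_smaller(total_group_gain: float, non_user_gain: float) -> float:
    """Percentage by which the non-user gain falls short of the total-group gain."""
    return 100 * (total_group_gain - non_user_gain) / total_group_gain


# Reading, Table 7: (total group, non-users) standardized gains.
print(round(percent_smaller(0.17, 0.10)))  # 41  (Grade 1, "about 40%")
print(round(percent_smaller(0.13, 0.04)))  # 69  (Grade 2, "about 70%")
```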

The non-user estimates of standardized gains in reading achievement are
smaller than the total-norming-group estimates in Grades 1 through 6 and Grades 8 and
9, albeit by only a trivial amount at Grade 3. The two sets of estimates are the same
to two decimal places in Grades 7 and 10, and the non-user estimates are actually
larger than those based on the total norming sample at Grades 11 and 12. For
mathematics, non-user estimates of achievement gains are 20% or more lower than
total group estimates only at Grades 2, 3, 6, and 7, while they are larger by an equal
percentage or more at Grades 9, 11, and 12.

Overall, the Wiser and Lenke results suggest that increasing familiarity with a
particular test form may explain part of the apparent growth in norm-referenced
test performance. The generally higher scores obtained by non-users in 1986 than
were obtained in the 1982 norming of the then new edition of the test, however,
suggest that there also has been some more generalized improvement in
performance, particularly in mathematics.

Results recently reported by Hoover (1989) for the Iowa Tests of Basic Skills
(ITBS) suggest that much of the increase in performance on a test form may occur on
the first operational administration of the form. From user data weighted to
estimate national performance, Hoover estimated that approximately 55% of the
students scored above the 1984-85 national median across Grades 3 through 8 on the
Battery Composite when Forms G and H were first administered operationally in
1985-86. In the second and third years of operational administration the average
percentage of students across Grades 3 through 8 who scored above the 1984-85
national median increased to 59% (1986-87) and then to 60% (1987-88).

The gains from the first year to the second and third years of operational use
reported by Hoover may be attributable to a combination of real gains in
achievement and increasing familiarity with a test form. The relatively large gain in
the first year that the test was used operationally, however, may be due to a
combination of several additional factors such as (a) the selection of a test that was
most closely aligned with the state or district curriculum, (b) greater emphasis on the
importance of good test performance when the test was used operationally than
when it was normed, and (c) the exclusion of a larger fraction of less able students in
operational test administrations than in norming studies. Indirect support for the
latter explanation comes from Hoover's finding that only about 6%, rather than the
expected 10%, of the students scored below the 10th percentile during the first
year of operational administration of Forms G and H of the ITBS. High scores (at or
above the 90th percentile), on the other hand, occurred at the expected rate of
10% in the first year of operational test use.



Discussion 

Weighted estimates from the district sample suggest that at least 57% of the
students in Grades 1 through 6 are obtaining scores above the national median on
norm-referenced reading tests. The corresponding figure for mathematics is 62%.
The comparable figures for Grades 7 through 12 are lower, but still somewhat greater
than 50%. The state results are quite consistent with the district estimates. Thus,
the results of the present study provide additional support for the general finding
by Cannell and by the SREB that for the elementary grades almost all states and the
majority of districts are reporting norm-referenced achievement test results that are
above the national median.

While supporting Cannell's general finding that it is more common for a state
or district to obtain test results that are "above the national average," our analyses
lead us to conclusions that are different, and certainly less sensational, than the
ones he reached. To begin with, it is important to put the "above average" findings
in context. Many students are receiving scores that are "below average" even in
districts or states that are reporting that substantially more than 50% of their
students are "scoring above the national average." When a district reports that 57%
of its students obtained reading scores that are at or above the national median, for
example, the other 43% of the students obviously scored below the median. It
should also be emphasized that although most districts report results that are "above
the national average," there are still many districts throughout the nation that are
reporting results that are below average. One out of 10 districts in our sample, for
example, reported that only about a third of its students at a given grade scored
above the national median in reading.

Cannell (1987) concluded that norm-referenced achievement tests are
producing inflated reports from states and districts on the achievement of their
students. But the finding that more than half the students are scoring above the
national median that was obtained when the norms were established does not
necessarily imply that the results are inflated. There are many factors that may lead
to the general finding, but it seems clear that the use of "old" norms is one of the
major factors that contributes to the abundance of "above average" scores.

The evidence reviewed provides strong support for the conclusion that
norms obtained for Grades 1 through 8 during the late 1970s or early 1980s are easier
on most tests than are more recent norms. Consequently, a state or district where
the average student scores at the current national average will be accurately
reported to be above a national average that is defined by norms that are several
years old. It appears that a substantial fraction of the "Lake Wobegon" phenomenon
may be attributable to the use of old norms. It should be noted that the use of old
norms is not purposeful on the part of school districts or states; they generally use
the most recent norms available. Since standardized tests are usually normed every
seven years, the most recent norms available will be, on average, 3.5 years old in
most school years.
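
The 3.5-year figure is the midpoint of a seven-year norming cycle. One way to see this is sketched below, under the illustrative assumption (not stated in the text) that the age of the norms is measured at the midpoint of each school year in the cycle; the function name is hypothetical.

```python
def mean_norm_age(cycle_years: int) -> float:
    """Average age of the most recent norms over one complete norming cycle.

    Assumes the age of the norms is measured at the midpoint of each school
    year: 0.5 years in the first year after norming, 1.5 in the second, etc.
    """
    ages = [year + 0.5 for year in range(cycle_years)]
    return sum(ages) / len(ages)


print(mean_norm_age(7))  # 3.5
```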

Concerns about dated norms have led to suggestions that publishers should
produce current annual norms (e.g., Cannell, 1988; Phillips and Finn, 1988) and
publishers are now attempting to do this by obtaining weighted estimates of national 
results from user data (e.g., Rothman, 1988). As Shepard (1989) has pointed out, 
however, annual norms based on user data potentially have several serious defects. 
If users differ from nonusers in ways other than those reflected by the demographic 
variables used for weighting, then user-based annual norms may be worse than dated 
norms where there is at least an understood frame of reference. In particular, if test 
familiarity leads to higher test performance, a state or district that changes 
publishers and administers a several-years-old test form for the first time will be at a 
disadvantage when results are compared to user norms (Shepard, 1989). 

The alternative of conducting special national norming studies every year, or
even every other year, is not a realistic or desirable possibility. Norming is not only
expensive, but the quality of the results is very dependent on voluntary
participation of schools and well-motivated students. Current participation rates in
norming studies conducted roughly every six or seven years by a publisher are
already far lower than would be desired. More frequent attempts to norm tests
would surely lower the participation rates still further and thereby degrade the
quality of the norms. Finally, it should be noted that although more recent norms
provide a more stringent standard of comparison when scores are going up as they
have been during the last decade, they would provide a less stringent standard
during periods of decline in scores such as that experienced between the mid 1960s
and the mid 1970s. Thus, we do not believe that the use of annual norms is an
appropriate or effective way to deal with problems caused by dated norms.

In any reporting of test scores emphasis needs to be given to the changing
meaning of norms and the age of the norms that are used. It obviously is not
sufficient to report that "students in state X are scoring above the national average"
without clearly indicating the year in which the norms were obtained. Simply
noting the year of the norms is not enough, however. An explanation of the
implications of shifting norms also needs to be provided along with an indication of
what is known about recent trends in the stringency of national norms.

There is ample evidence that scores on norm-referenced tests have been 
going up in Grades 1 through 8 in recent years. But the more important question is: 
Has student achievement improved in recent years? Unfortunately, the answer to 
the latter question is equivocal. 

Achievement test scores are of interest to the degree that they enable valid 
inferences to be made about broader achievement domains. But little attention has 
been given to the issue of the degree to which valid generalizations about broad 
achievement domains can be made from state or district test results. 

Comparisons of the changes in norms of standardized tests with estimates of 
changes in achievement based on NAEP results suggest that test norms may be 
changing more rapidly than is student achievement as measured by NAEP. The 
Wiser and Lenke (1987) findings that apparent increases are generally smaller for 
non-users than for users of a given test series suggest that part of the apparent 
growth in achievement based on norm-referenced test results may be due to 
increased familiarity with a particular form of a test. Only part of the apparent gain 
can be explained in this way, however. 

The differences between the gains in performance indicated by NAEP and 
by norm-referenced tests, and between Wiser and Lenke's total norming sample and 
their non-users suggest at the very least that caution is needed in interpreting gains 
in norm-referenced test scores as reflections of the amount of improvement that has 
taken place in achievement, more broadly defined. More direct assessments of the 
degree of generalizability of results to other tests and to other indicators of student 
achievement are greatly needed. 

Hoover's (1989) finding that only about 6% of the students scored below the 
10th percentile in the first year of operational administration of Forms G and H of 
the ITBS suggests that roughly a third to a half of the difference between the 
percentage of students scoring above the national median and the naive 
expectation of 50% may occur in the first year of use and may be due to what 
happens with the least able students. This suggests that greater emphasis in 
reporting needs to be given to the lower end of the score distribution and to the 
students who are excluded from testing when results are reported by states or 
districts. It may be quite appropriate, indeed desirable, to exclude students with 
limited English proficiency or students receiving particular types of special 
education services from a norm-referenced test administration. Such students should 
not be ignored, however, when district or state achievement results are reported. 
At minimum, the number of such students and the reasons for exclusion from testing 
should be reported. 

The practice of using a single form of a test year after year poses a logical 
threat to making inferences about the larger domain of achievement. Scores may be 
raised by focusing narrowly on the test objectives without improving achievement 
across the broader domain that the test objectives are intended to represent. Worse 
still, practice on nearly identical or even the actual items that appear on a test may 
be given. As Dyer aptly noted some years ago, "If you use the test exercises as an 
instrument of teaching you destroy the usefulness of the test as an instrument for 
measuring the effects of teaching" (1973, p. 89). 

Current accountability pressures place great emphasis on test scores. It is 
unlikely that any single test, no matter how well constructed, normed, and 
validated, can withstand the pressures to serve both as an instrument of instruction 
and as an instrument for measuring the effects of instruction. Making valid 
inferences about broad achievement domains from test scores has always been a 
challenging and difficult undertaking, but it is made all the harder by current 
demands for accountability and the use of standardized test results as primary 
indicators of accountability. 

Appendix A 

Sample Letter and Data Collection Form for Directors of State Testing Programs 

July 22, 1988 



[Name and address illegible] 

Dear [illegible]: 

We seek your assistance in a study [illegible] the Office of [illegible]. The study was 
stimulated by the report "Nationally Normed Elementary Achievement Testing in 
America's Public Schools: How All Fifty States Are Above the National Average" by 
Dr. John J. Cannell. [Illegible] considerable attention in the press and has been of 
great interest at OERI and among those concerned about the assessment of 
educational achievement. 

Cannell's findings [illegible] are both provocative and controversial. [Illegible] 
called into question by Cannell's finding [illegible] below the publisher's national 
norm [illegible] major nationally normed [illegible] (Educational Measurement: 
Issues and Practice, 1988, Vol. 7, No. 2). 

Given the importance that is attached to [illegible] widespread use of normative 
comparisons, Cannell's findings [illegible] deserve close scrutiny. We [illegible] 
magnitude and [illegible] achievement results reported by Cannell. We also need to 
have a better understanding of the factors which may contribute to and explain the 
findings. 

To achieve these goals, we need your help in collecting information that will 
provide a better data base for determining not only what proportion of students 
score above the 50th percentile [illegible] important characteristics of the 
[illegible] changes over time and the variability in scores. We also need 
information on the way in which test results are [illegible] retention, [illegible] 
instituted, and planned changes in the use of test results. Finally, we are seeking 
information about policies regarding test security and guidelines on preparation of 
students for taking tests. 

A CRESST staff member will be contacting you by phone to seek your assistance 
and to arrange for a time for a phone interview with an appropriate person on 
your staff. The information that will be requested is outlined on the 
enclosure. We will send you more detailed worksheets between now and the time 
of the telephone interview to help organize the requested information. 

In many cases, the information that we are seeking may be provided in reports 
that have previously been prepared. Thus we request that you send us copies 
of any reports that give summaries of district results that have been 
published within the past three years. Copies of press releases and newspaper 
articles about the test results would also be useful. If you send us reports 
and press releases as quickly as possible, we will use the reports to extract 
as much of the requested information as possible. We will call you to ask 
questions after we have "done our homework". 

Please send reports to: Robert L. Linn 

School of Education 
Campus Box 249 
University of Colorado 
Boulder, CO 80309-0249 

Thank you for your consideration. We will phone you within the next two weeks 
to answer questions and to try to arrange a time for a telephone interview. A 
return postcard is enclosed so that you can indicate the name, phone number, 
and best times for us to try to contact the appropriate person for the 
telephone interview. 

Sincerely, 

Eva L. Baker 
UCLA 

Robert L. Linn 
University of Colorado-Boulder 

Co-Directors, Center for the Study of Research on Evaluation, 
Standards, and Student Testing 


Explanation of Information Requested 

Column   Information requested 

1    Testing year. 
2    Grade levels tested, K - 12. 
3    Name of test used for statewide assessment, e.g., CTBS, MAT, name of 
     locally developed test. 
4    Edition of the test used at each grade level, e.g., 1982. 
5    Form of the test used at each grade level. 
6    Year when test was first used. 
7    Norming year of test used for reporting scores. 
8    Month in which tests were administered. 
9    Type of scores reported, e.g., percent correct, percentile rank, NCE. 
     n.b. If you have more than one type of score, please provide one form 
     of data in the preferred order as follows: 
          Percentile Rank 
          Grade Equivalents 
          NCE 
          Stanines 
          Percent Correct 
10   Number of students enrolled: the total number of students by grade 
     statewide. 
11   Number of students tested. 
12   Number of students' scores reported: If not all scores are used to 
     compute rankings or other statewide test results, enter the number of 
     students' scores used to compute the achievement data. 
13   Reading: The percent of students scoring above the national 50th 
     percentile statewide. 
14   Math: The percent of students scoring above the national 50th 
     percentile statewide. 
     n.b. If [illegible] requested in 13 and 14 [illegible], please 
     [illegible] on the form. 

If the data requested in columns 13 or 14 (percent of students scoring above the 
national 50th percentile) are not available, please provide as much of the following 
as possible (columns 15 - 20 on the Alternate Information Sheet). 



Column 

15   Reading statewide mean. 
16   Reading statewide standard deviation. 
17   Math statewide mean. 
18   Math statewide standard deviation. 
19   Reading score at each percentile: The score at the 25th 
     percentile statewide. 
     - at the 50th percentile statewide. 
     - at the 75th percentile statewide. 
20   Math score at each percentile: The math score at the 25th 
     percentile statewide. 
     - at the 50th percentile statewide. 
     - at the 75th percentile statewide. 

Type of scores: If the type of scores reported in columns 13-20 is not 
the same as that indicated in column 9, please indicate the type of 
scores used to compute the percentiles, mean, and standard deviations. 
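
When only the alternate information is supplied, the percent of students above the national median has to be approximated. The sketch below shows two simple possibilities, neither of which is claimed to be the procedure used in this study: a normal approximation from the statewide mean and standard deviation (columns 15-18), and linear interpolation between the statewide 25th, 50th, and 75th percentile scores (columns 19-20). The function names, the default national median of 50 (appropriate only for scores such as NCEs), and any example values are assumptions.

```python
from statistics import NormalDist


def percent_above_from_mean_sd(mean, sd, national_median=50.0):
    """Normal approximation: percent of students above the national median.

    Assumes statewide scores are roughly normal with the reported mean
    and standard deviation, on the same scale as `national_median`.
    """
    if sd <= 0:
        raise ValueError("standard deviation must be positive")
    state = NormalDist(mu=mean, sigma=sd)
    return 100.0 * (1.0 - state.cdf(national_median))


def percent_above_from_quartiles(p25, p50, p75, national_median=50.0):
    """Linear interpolation between the statewide quartile scores.

    Only defined when the national median lies between the statewide
    25th and 75th percentile scores.
    """
    if not p25 < p50 < p75:
        raise ValueError("quartile scores must be strictly increasing")
    if not p25 <= national_median <= p75:
        raise ValueError("national median lies outside the reported quartiles")
    if national_median <= p50:
        # Statewide percentile rank of the national median, between 25 and 50.
        rank = 25.0 + 25.0 * (national_median - p25) / (p50 - p25)
    else:
        # Statewide percentile rank of the national median, between 50 and 75.
        rank = 50.0 + 25.0 * (national_median - p50) / (p75 - p50)
    return 100.0 - rank
```

Both functions return 50 when the statewide distribution is centered on the national median. The interpolation makes no distributional assumption but is crude near the ends of its range, so the two estimates need not agree.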



Statewide Testing Information 

State Name 
Person Supplying Information 
Title 

[Form grid. Columns: 1 Testing Year; 2 Grade; 3 Test Name; 4 Edition; 5 Form; 6 Year First Used; 7 Norming Year; 8 Testing Dates; 9 Type of Scores. Rows: testing years 1985-1986, 1986-1987, and 1987-1988 for each grade, beginning with kindergarten.] 

Refer to Explanation of Information Requested - Attached 

[Form grid, continued. Columns: 10 Number of Students Enrolled; 11 Number of Students Tested; 12 Number of Students' Scores Reported; 13 Reading: % of Students above National 50%ile; 14 Math: % of Students above National 50%ile.] 

Alternate Information Available 



Person Supplying Information 
State Name 

[Form grid. Rows: testing years 1985-1986, 1986-1987, and 1987-1988 for each grade; grades 7 through 12 are legible. Column headings are not legible in this copy.] 

code 



District 
State 



Interviewer 



date 



Person(s) Interviewed 



name 



name 



title 



title 



Background information: 



Number of schools in district 



Size (range) 



Center for the Study 
Robert L. Linn, School 

Part I: District Testing Data (to be recorded on the forms provided) 



YEARS TESTED 
1. Are districtwide test results available for: 

_ 1987-88 
_ 1986-87 
_ 1985-86 

If none, then the most recent year: 

If there is no districtwide testing, ask only 12, 13, 19 - 22, and 26 for large 
districts. 

ENROLLMENT BASIS 
2. What is the basis for the enrollment [illegible] 
number of students in each grade? (e.g., ADA - Average Daily 
Attendance) 

ENROLLMENT SOURCE 
3. What office provides the figures? 
[name of person and phone number if easily available] 

TESTED - REPORTED 
4. Is the number of students tested the same as the number 
of students that are included in the reported test results? 

Yes 

If no, how does the number included in the reported test results 
differ from the number tested? 

probe: special education 

5. Were all eligible students in the grade tested or is a 
sampling plan used? 

universal testing by grade sampling plan 

Please describe any sampling procedures used. 



6. What rules are used to determine students who are excluded 
from testing? 

request: copies of any written policies that describe these rules 

7. How many students (or what percent of the students) are 
excluded using these rules? 

8. What are the policies for make-up testing (for students who 
are absent)? 

request: if in writing 

[Ask the following only if needed:] 

9. If a specially constructed test is used, is it linked to a [illegible] 
norm-referenced test? If so, what is the name and edition of the 
norm-referenced test? 

10. If the percent of students above the 50th percentile is 
unknown, please describe the way in which scores are reported 
[illegible] comparisons are made to the national norm. 

11. Are any factors of schools or the characteristics of their 
students taken into account in reporting test scores? 
(e.g., percent minority, percent eligible for free lunch, Chapter I) 

[BEGIN TAPE RECORDING] 

Part II: Testing Policies and Perceptions 

USES AND IMPORTANCE 
12. What are the uses of test results? 

- local district and school instructional and evaluation decisions 
- reporting to parents about individual student progress or school 
  programs 
- School Board attention (And if so, how have Board members used test 
  results: to increase testing programs or other forms of 
  accountability?) 
- state or local politician use of scores in campaigning or proposing 
  legislation 
- changing general funding levels for schools 
- targeted funds or mandating programs such as remediation 
- superintendent, principal, or teacher performance rating or jobs 
- media coverage and community awareness 

How important are test scores in your district? 

extremely     very     moderately     slightly     not important 

REFORMS 



13. Have major educational reforms been introduced in your 
district in the past five years? 

request: Would you briefly describe these or send us written 
descriptions that are available? 

TEST SELECTION 
14. Who selected the standardized test(s) being used? (If locally 
developed, how was the content selected?) 

probe: committee composition, e.g., teachers, parents.... 

CURRICULUM ALIGNMENT 
15. Have there been efforts to assure that the curriculum and the 
test are aligned? 

If so, please describe those efforts. 

TIME ON SPECIFIC OBJECTIVES 
16. Do you think that teachers spend more time teaching the 
specific objectives on the test(s) than they would if the tests 
were not required? 

How much more time? 



IMPORTANT OBJECTIVES GIVEN LESS TIME 
17. To what extent do you think important objectives are given 
less time or emphasis because they are not included on the 
test? 

What kinds of objectives are neglected? 

INFORMAL GUIDELINES ABOUT TEST PREPARATION 
18. Do you or members of your staff provide informal guidelines 
about test preparation? What kind of advice do you give 
schools about how to prepare students to take 
tests? 

probes: 
length of time to practice 
minimum and maximum recommended time for practice 
whether to use items in a specific format for practice 

TECHNICAL ASSISTANCE ABOUT TEST PREPARATION 
19. What kind of technical assistance or materials do you 
provide to schools about test preparation? 

request: Would you send us copies of the materials or descriptions 
of the assistance? 

probes: 
practice tests 
testwiseness packages 
curriculum domain materials but not specific test items 
amount of these activities 


TYPICAL PRACTICES OF TEST PREPARATION 
20. Can you describe typical practices of test preparation? 

probes: 
If they say one school does X, ask how common this is, or how many 
other schools do the same. 

Do schools use the materials and assistance you provide? 
What else do they do beyond what you recommend? 

EXTREME PRACTICES OF TEST PREPARATION 
21. Can you describe extreme cases of test preparation? 

If they describe a worst case, ask what they would think of as a best 
case (as well as what is more typical, above). 
Examples of cases which violate your recommendations? 



TEST ADMINISTRATION AND SECURITY POLICIES 
22. Do you have written policies regarding test 
administration and security procedures? 

If not, do you have informal guidelines? 

request: written policies 

WHO ADMINISTERS OR HAS TESTS 
23. Who administers the tests? 
Do teachers in some schools have copies of the 
tests prior to test administration? 

[illegible] KNOWS TESTS 
How familiar are teachers with the specific items on the 
tests? 

probes: 
teachers administering same test over years 
principals or teachers having test files 



DETECT ANOMALIES 
24. Do you have any formal procedures for detecting anomalies in 
the data? 

request: copies 

probes: 
check for missing test booklets [illegible] 
computer detection of significant numbers of erasures 
computer detection of [illegible] gains from one year to the next 
check numbers of students tested against enrollment 

TYPICAL AND EXTREME PRACTICES 
25. Can you give examples of both typical and extreme testing 
practices? 

[illegible] score reports because of suspected cheating? 
good practices: consistent, successful make-up testing 

examples of cheating: 
teachers filling in answers 
extending time limits for tests 
teaching specific items on the test 
discrepancies in numbers of students tested 



[Ask the following only in districts designated as 7's or 8's: large districts] 

REACTIONS TO CANNELL REPORT 
26. What are your reactions to the Cannell report and its 
conclusions? 

FACTORS IN ACHIEVEMENT TRENDS 
27. What do you think are the [illegible] to 
the recent trends in achievement test scores in your 
district? 

probes: 
educational reforms 
norms (unrepresentative or old) 
pressure on teachers to have high scores 

Closing: 

When finishing and thanking them for their time, review the things which you may 
have requested in writing. 



Checklist of Requested Written Information 

_ testing data on years not yet received (e.g., all three years 1985-1988) 
_ testing data such as distribution measures 
_ #3 - name and phone of office or person with enrollment figures 
_ #6 - Rules for testing exclusions 
_ #8 - Policies for make-up testing 
_ #13 - Educational reforms in the state 
_ #19 - Technical assistance or materials for test preparation 
_ #22 - Test administration and security policies 
_ #24 - Procedures for detecting anomalies 

The address for mailing is: 

Dr. Robert Linn 
University of Colorado 
School of Education 
Campus Box 249 
Boulder, CO 80309 

303-492-8280 (Bob), [illegible] (Nancy), or [illegible] (Lorrie) 

If you have missing answers and have to schedule another call, please indicate 
that in the telephone log. 

Appendix C 

Number of Districts Available by Cells in Sampling Design 



                                                      Number of 
                                                      Districts 
Region         District Size       SES Level          Available 

East           Less than 1,200     Low                5 
                                   Below Average      5 
                                   Average            5 
                                   Above Average      5 
                                   High               5 
               1,200 to 2,499      Low                5 
                                   Below Average      5 
                                   Average            [illegible] 
                                   Above Average      5 
                                   High               5 
               2,500 to 4,999      Low                5 
                                   Below Average      5 
                                   Average            5 
                                   Above Average      5 
                                   High               5 
               5,000 to 9,999      Low                5 
                                   Below Average      5 
                                   Average            5 
                                   Above Average      5 
                                   High               5 
               10,000 to 24,999    Low                5 
                                   Below Average      5 
                                   Average            5 
                                   Above Average      5 
                                   High               5 
               25,000 to 49,999    Low                2 
                                   Below Average      4 
                                   Average            0 
                                   Above Average      1 
                                   High               1 
               50,000 to 99,999    Low                1 
                                   Below Average      2 
                                   Average            1 
                                   Above Average      1 
                                   High               0 
               100,000 or more     Low                1 
                                   Below Average      2 
                                   Average            0 
                                   Above Average      2 
                                   High               1 

Number of 
Districts 

Region         District Size       SES Level          Available 

North/Central  Less than 1,200     Low                5 
                                   Below Average      5 
                                   Average            5 
                                   Above Average      5 
                                   High               5 
               1,200 to 2,499      Low                5 
                                   Below Average      5 
                                   Average            5 
                                   Above Average      5 
                                   High               5 
               2,500 to 4,999      Low                5 
                                   Below Average      5 
                                   Average            5 
                                   Above Average      5 
                                   High               5 
               5,000 to 9,999      Low                5 
                                   Below Average      5 
                                   Average            5 
                                   Above Average      5 
                                   High               5 
               10,000 to 24,999    Low                1 
                                   Below Average      5 
                                   Average            5 
                                   Above Average      5 
                                   High               5 
               25,000 to 49,999    Low                0 
                                   Below Average      2 
                                   Average            5 
                                   Above Average      5 
                                   High               4 
               50,000 to 99,999    Low                [illegible] 
                                   Below Average      3 
                                   Average            2 
                                   Above Average      0 
                                   High               0 
               100,000 or more     Low                0 
                                   Below Average      1 
                                   Average            1 
                                   Above Average      0 
                                   High               [illegible] 

Number of 
Districts 

Region District Size SES Level Available 



South Less than 1,200 Low 5 

Below Average 5 

Average 5 

Above Average 2 

High 3 

1,200 to 2,499 Low 5 

Below Average 5 

Average 5 

Above Average 2 

High 0 

2,500 to 4,999 Low 5 

Below Average 5 

Average 5 

Above Average 5 

High 5 

5,000 to 9,999 Low 5 

Below Average 5 

Average 5 

Above Average 5 

High 3 

10,000 to 24,999 Low 5 

Below Average 5 

Average 5 

Above Average 5 

High 4 

25,000 to 49,999 Low 2 

Below Average 3 

Average 5 

Above Average 5 

High 2 

50,000 to 99,999 Low 1 

Below Average 3 

Average 5 

Above Average 5 

High 1 

100,000 or more Low 0 

Below Average 1 

Average 5 

Above Average 0 

High 1 

Number of 
Districts 

Region         District Size       SES Level          Available 

West           Less than 1,200     Low                5 
                                   Below Average      5 
                                   Average            5 
                                   Above Average      5 
                                   High               5 
               1,200 to 2,499      Low                5 
                                   Below Average      5 
                                   Average            5 
                                   Above Average      5 
                                   High               5 
               2,500 to 4,999      Low                5 
                                   Below Average      5 
                                   Average            5 
                                   Above Average      5 
                                   High               5 
               5,000 to 9,999      Low                5 
                                   Below Average      5 
                                   Average            5 
                                   Above Average      5 
                                   High               5 
               10,000 to 24,999    Low                5 
                                   Below Average      5 
                                   Average            5 
                                   Above Average      5 
                                   High               5 
               25,000 to 49,999    Low                2 
                                   Below Average      2 
                                   Average            5 
                                   Above Average      5 
                                   High               5 
               50,000 to 99,999    Low                1 
                                   Below Average      1 
                                   Average            5 
                                   Above Average      5 
                                   High               1 
               100,000 or more     Low                0 
                                   Below Average      0 
                                   Average            3 
                                   Above Average      1 
                                   High               0 

August 18, 1988 




Dear [illegible]: 

We seek your assistance in a study that is being conducted by the Center for 
Research on Evaluation, Standards, and Student Testing (CRESST) on behalf of 
the U.S. Department of Education's Office of Educational Research and 
Improvement (OERI). This study was stimulated by the report "Nationally 
Normed Elementary Achievement Testing in America's Public Schools: How All 
Fifty States Are Above the National Average" by Dr. John J. Cannell. As [illegible] 
report attracted considerable attention in the press and has been of great 
interest at OERI and among those concerned about the assessment of educational 
achievement. 

Cannell's findings and conclusions are both provocative and controversial. 
Based on his survey of states and selected school districts, Cannell 
concluded that "standardized, nationally normed achievement tests give 
children, parents, school systems, legislatures, and the press misleading 
reports on achievement levels" (p. 6 of special issue of Educational 
Measurement: Issues and Practice, 1988, Vol. 7, No. 2). 

Given the importance that is attached to student achievement and the 
widespread use of normative comparisons, Cannell's findings and conclusions 
deserve close scrutiny. We need to have technically accurate information 
about achievement results reported by school districts across the nation. We 
also need to have a better understanding of the factors which may contribute 
to and explain the findings. 

To achieve these goals, we need your help in collecting information from a 
nationally representative sample of school districts that will provide a 
better data base for determining not only what level of student performance is 
being reported, but the uses and interpretations that are being made of the 
results. We also are seeking information about factors that may influence 
test results. 

Your district has been selected as part of a nationally representative sample 
for this study. Hence, your participation is critical to maintaining 
representativeness and drawing conclusions about achievement testing for the 
nation. Results will not be reported for individual school districts. 
However, participation [illegible] sampled district is essential to ensuring an 
accurate picture of the nation as a whole. 

We ask that you complete the enclosed questionnaire about your district's 
testing program. In many cases, the information that we are seeking on the 
forms may be provided in reports that have previously been prepared. If so, 
we request that you answer the general questionnaire items and send us the 
questionnaire along with copies of any reports that give results of 
districtwide assessments of student achievement or summaries of district 
results that have been published within the past three years. We will use 
those reports to obtain the requested information. Copies of press releases 
and newspaper articles about the test results would also be useful. 

Please return the completed questionnaire in the enclosed envelope to: 

Robert L. Linn 
School of Education 
Campus Box 249 
University of Colorado 
Boulder, CO 80309-0249 

We also ask you to participate in a telephone interview which [illegible] 
additional questions about testing policies and practices. In order to 
schedule an interview, we ask that you indicate on the questionnaire dates and 
times which would be convenient for one of our staff members to call. The 
interviews consist of fifteen questions about your testing program and usually 
last about 30 minutes. 

Thank you for your consideration. We realize that school districts receive 
many requests for information and that responding to such requests is a burden 
on your time. Your willingness to help is essential to the success of the 
study and to our ability to provide solid answers to the important educational 
questions that were raised by the Cannell report. 

Sincerely, 

Eva L. Baker                 Robert L. Linn 
UCLA                         University of Colorado-Boulder 

Co-Directors, Center for Research on Evaluation, Standards, and 
Student Testing 

August 18, 1988 



Dear [illegible]: 

We seek your assistance in a study that is being conducted by the Center for 
Research on Evaluation, Standards, and Student Testing (CRESST) on behalf of 
the U.S. Department of Education's Office of Educational Research and 
Improvement (OERI). This study was stimulated by the report "Nationally 
Normed Elementary Achievement Testing in America's Public Schools: How All 
Fifty States Are Above the National Average" by Dr. John J. Cannell. As [illegible] 
report attracted considerable attention in the press and has been of great 
interest at OERI and among those concerned about the assessment of educational 
achievement. 

Cannell's findings and conclusions are both provocative and controversial. 
Based on his survey of states and selected school districts, Cannell 
concluded that "standardized, nationally normed achievement tests give 
children, parents, school systems, legislatures, and the press misleading 
reports on achievement levels" (p. 6 of special issue of Educational 
Measurement: Issues and Practice, 1988, Vol. 7, No. 2). 

Given the importance that is attached to student achievement and the 
widespread use of normative comparisons, Cannell's findings and conclusions 
deserve close scrutiny. We need to have technically accurate information 
about achievement results reported by school districts across the nation. We 
also need to have a better understanding of the factors which may contribute 
to and explain the findings. 

To achieve these goals, we need your help in collecting information from a 
nationally representative sample of school districts that will provide a 
better data base for determining not only what level of student performance is 
being reported, but the uses and interpretations that are being made of the 
results. We also are seeking information about factors that may influence 
test results. 

Your district has been selected as part of a nationally representative sample 
for this study. Hence, your participation is critical to maintaining 
representativeness and drawing conclusions about achievement testing for the 
nation. Results will not be reported for individual school districts. 
However, participation [illegible] sampled district is essential to ensuring an 
accurate picture of the nation as a whole. 

We ask that you complete the enclosed questionnaire about your district's 
testing program. In many cases, the information that we are seeking on the 
forms may be provided in reports that have previously been prepared. If so, 
we request that you answer the general questionnaire items and send us the 
questionnaire along with copies of any reports that give results of 
districtwide assessments of student achievement or summaries of district 
results that have been published within the past three years. We will use 
those reports to obtain the requested information. Copies of press releases 
and newspaper articles about the test results would also be useful. 

Please return the completed questionnaire in the enclosed envelope to: 

Robert L. Linn 
School of Education 
Campus Box 249 
University of Colorado 
Boulder, CO 80309-0249 

Thank you for your consideration. We realize that school districts receive 
many requests for information and that responding to such requests is a burden 
on your time. Your willingness to help is essential to the success of the 
study and to our ability to provide solid answers to the important educational 
questions that were raised by the Cannell report. 

Sincerely, 



Eva L. Baker                 Robert L. Linn 
UCLA                         University of Colorado-Boulder 

Co-Directors, Center for Research on Evaluation, Standards, and 
Student Testing 



District Testing Information 



District Name 



Person Supplying Information 



State 



Address 



Phone Number 



Title 
[Form grid. Column headings legible in this copy: 8 Testing ..., 9 Type of .... Rows: testing years 1985-1986, 1986-1987, and 1987-1988 for each grade; grades 7 through 12 are legible.] 

Please Refer to Explanation of Information Requested - Attached 

[Form grid, continued. Columns: 10 Number of Students Enrolled; 11 Number of Students Tested; 12 Number of Students' Scores Reported; 13 Reading: % of Students above National 50%ile; 14 Math: % of Students above National 50%ile.] 

8-11. Please indicate below the name of the test used at each grade level 
(include edition and form), the number of students tested, AND THE PERCENT OF STUDENTS ABOVE THE 
NATIONAL 50TH PERCENTILE. (If the percent of students above the national 50th percentile is not available, 
please provide as much of the information on pages 4 and 5 as possible.) 



[Form grid. Columns: Testing Year; Grade; 8. Test Name, Edition and Form; 9. Number of Students Tested; 10. Reading: % of Students above National 50%ile; 11. Math: % of Students above National 50%ile. Rows: testing years 1985-1986, 1986-1987, and 1987-1988 for each grade, kindergarten through grade 12.] 

12. Testing Dates (month/year) 



13. Norming year of norm-referenced test(s) used:



14. Year these tests were first used in your district 
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4 

If the percent of students above the national 50th percentile is provided on pages 2 and 3, pages 4 and
5 need not be completed. Skip to page 6.

If the number of students above the national 50th percentile (columns 10 and
11, pages 2-3) is not known, please provide as much of the following
information as possible.



Testing Year | Grade | Reading: Mean, Standard Deviation | Math: Mean, Standard Deviation | Reading Score at each percentile (25, 50, 75) | Math Score at each percentile (25, 50, 75)

[Blank data-entry grid: one row for each testing year (1985-1986, 1986-1987, 1987-1988) at each of grades K through 6.]
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Alternate Information Available 
District Test Results

District Name

Person Supplying Information

State

Address

Phone Number

Title

Testing Year | Grade | 15. Reading Mean | 16. Reading Standard Deviation | 17. Math Mean | 18. Math Standard Deviation | 19. Reading Score at each percentile (25, 50, 75) | 20. Math Score at each percentile (25, 50, 75)

[Blank data-entry grid: one row for each testing year (1985-1986, 1986-1987, 1987-1988) at each grade.]
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Explanation of Information Requested 



Column   Information requested

1    Testing year.

2    Grade levels tested, K - 12.

3    Name of test used, e.g., CTBS, MAT, name of locally developed test.

4    Edition of the test used at each grade level, e.g., 1982.

5    Form of the test used at each grade level.

6    Year when test was first used.

7    Norming year of test used for reporting scores.

8    Month in which tests were administered.

9    Type of scores reported, e.g., percent correct, percentile rank, NCE.

     n.b. If you have more than one type of score, please provide one form
     of data in the preferred order as follows:

          Percentile Rank
          Grade Equivalents
          NCE
          Stanines
          Percent Correct

10   Number of students enrolled: the total number of students enrolled by
     grade.

11   Number of students tested at each grade.

12   Number of students' scores reported: If not all scores are used to
     compute rankings or other statewide test results, enter the number of
     students' scores used to compute the achievement data.

13   Reading %: The percent of students scoring above the national 50th
     percentile.

14   Math %: The percent of students scoring above the national 50th
     percentile.

n.b. If neither reading nor math data requested in 13 and 14 are available, please
provide the most appropriate composite scores and indicate the nature of these
on the form.
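As an illustration only, one grade-by-year row of columns 10 through 14 could be held in a small record and screened for entries that cannot be right. The class name `GradeRow`, the function `problems`, and the three checks are assumptions made for this sketch; the form itself does not prescribe any validation.

```python
from dataclasses import dataclass


@dataclass
class GradeRow:
    """One grade-by-year row of the form (hypothetical layout, columns 10-14)."""
    enrolled: int          # column 10
    tested: int            # column 11
    scores_reported: int   # column 12
    reading_pct: float     # column 13
    math_pct: float        # column 14


def problems(row: GradeRow) -> list[str]:
    """Return plain-language descriptions of implausible entries."""
    issues = []
    # Assumed check: a district cannot test more students than it enrolls.
    if row.tested > row.enrolled:
        issues.append("tested exceeds enrolled")
    # Assumed check: scores used in reports come from tested students.
    if row.scores_reported > row.tested:
        issues.append("scores reported exceeds tested")
    # Percents above the national 50th percentile must lie in 0-100.
    for name, pct in (("reading", row.reading_pct), ("math", row.math_pct)):
        if not 0 <= pct <= 100:
            issues.append(f"{name} percent outside 0-100")
    return issues
```

A row that passes returns an empty list; each failed check adds one message, so a reviewer can see every problem with a row at once.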




If the data requested in columns 13 or 14 (percent of students scoring above the
national 50th percentile) are not available, please provide as much of the following
as possible (columns 15 - 20 on the Alternate Information Sheet):

Column

15   Reading mean for the district.

16   Reading standard deviation.

17   Math mean.

18   Math standard deviation.

19   Reading score at each percentile: The reading score

     - at the 25th percentile districtwide.

     - at the 50th percentile districtwide.

     - at the 75th percentile districtwide.

20   Math score at each percentile: The math score

     - at the 25th percentile districtwide.

     - at the 50th percentile districtwide.

     - at the 75th percentile districtwide.



Type of scores: If the type of scores reported in columns 13-20 is not
the same as that indicated in column 9, please indicate the type of
scores used to compute the percentiles, means, and standard deviations.
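When a district supplies only a mean and standard deviation (columns 15 through 18), the percent of students above the national median has to be approximated. The sketch below shows one way this could be done, assuming the scores are NCEs (where 50 corresponds to the national 50th percentile) and assuming the district's scores are roughly normally distributed. Both assumptions, and the function name `pct_above_national_median`, are ours for illustration; the form does not state how such data were converted.

```python
from statistics import NormalDist


def pct_above_national_median(mean_nce: float, sd_nce: float,
                              national_median: float = 50.0) -> float:
    """Estimate the percent of a district's students above the national median.

    Assumes NCE scores and a normal distribution within the district.
    """
    # Share of the assumed normal distribution at or below the national median.
    share_below = NormalDist(mu=mean_nce, sigma=sd_nce).cdf(national_median)
    return 100.0 * (1.0 - share_below)
```

A district whose NCE mean is exactly 50 comes out at 50 percent under this approximation; a higher mean or a smaller standard deviation pushes the estimate further above 50.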
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District Subsample for Telephone Interviews 
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E.2 



Appendix E (Continued, page 2 of 2) 

Using the total sample code RZS where
R = region (1 = East, 2 = North/Central, 3 = South,
    and 4 = West);
Z = size (1 = less than 1,200, 2 = 1,200-2,499, 3 = 2,500-4,999,
    4 = 5,000-9,999, 5 = 10,000-24,999, 6 = 25,000-49,999,
    7 = 50,000-99,999, and 8 = 100,000 or more); and
S = SES (1 = low, 2 = below average, 3 = average,
    4 = above average, and 5 = high),
the following interview subsample was selected.

112             211             312             411
123             213             323             415
124             225             324             423
131             233             332             432
134             242             335             433
145             245             343             445
153             251             353             454
155             255             362             462
161             263             365             463
172             272             371             471
173             273             373             474
174             275 (void)      374             474
181             282             382             481 (void)
183 (void)      283             383             483
184             285 (void)      385             484
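The three-digit RZS code defined above can be unpacked mechanically. The following is a minimal sketch; the names `REGION`, `SIZE`, `SES`, and `decode_rzs` are hypothetical, and the category labels are copied from the definition of the code in this appendix.

```python
REGION = {1: "East", 2: "North/Central", 3: "South", 4: "West"}
SIZE = {
    1: "less than 1,200", 2: "1,200-2,499", 3: "2,500-4,999",
    4: "5,000-9,999", 5: "10,000-24,999", 6: "25,000-49,999",
    7: "50,000-99,999", 8: "100,000 or more",
}
SES = {1: "low", 2: "below average", 3: "average",
       4: "above average", 5: "high"}


def decode_rzs(code: str) -> dict:
    """Split a three-digit RZS sample code into region, size, and SES labels."""
    if len(code) != 3 or not code.isdigit():
        raise ValueError(f"expected three digits, got {code!r}")
    r, z, s = (int(ch) for ch in code)
    # A digit outside the defined ranges raises KeyError.
    return {"region": REGION[r], "size": SIZE[z], "ses": SES[s]}
```

For example, `decode_rzs("411")` describes a Western district with fewer than 1,200 students and low SES.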
Appendix F (page 1 of 4)
Grades Tested by Districts Returning Data

Grade

Region Size SES K 1 2 3 4 5 6 7 8 9 10 11 12

[Table grid not reproduced: a "+" marks each grade for which a district returned data, but the "+" entries on this page cannot be matched to their rows. The surviving row labels and notes are listed below.]

Size and SES labels, in order (Region column missing): 1 1; 1 3; 1 4; 2 1; 2 2; 2 4; 2 5; 3 1; [?] 2; 3 3; 3 4; 3 5; [?] 2; 4 3; 4 4; 4 5; 5 3; 5 4; 5 5; 6 1; 6 2; 6 4; 6 5; 7 1; 7 2; 7 2; 7 3; 7 4; 8 1; 8 2; 8 2; 8 3; 8 4; 8 5

Region, Size, and SES labels, in order: 2 1 1; 2 1 3; 2 1 4; 2 1 5; 2 2 4; 2 2 5; 2 3 1; 2 3 2; 2 3 3; 2 3 4; 2 4 1

Notes entered in place of grade marks:
Interview completed - no normed test results
Questionnaire completed - no usable test results
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Appendix F (page 2 of 4) 

Grade 



Region Size SES K 1 2 3 4 5 6 7 8 9 10 11 12



2 4 2 
2 4 3 
2 4 4 



4- + + + + 
4. + + + + + 



2 5 2 +^. + + + + 

2 5 3 + + + + + 

2 5 4 Criterion Referenced Test results only

ill + + + + 



+ + + + 
+ + 



III ^ 4 ^ ^ ^ ^ ^ ^ ^ * 

263 ^. + 4. + + 



2 6 4 

2 6 5 

2 6 5 

2 7 1 + + + + + + 

2 7 2 + + + -»- + "( 
2 



+ + + + + + 
+ + + 



+ + + 

+ + + + + + + 
+ + + + + + + 



7 2 4. + -}- + + + 



+ + + + + + 



+ + + + + + + 
+ + 



2 7 2 + + + + 

2 7 3 + + *** 

^ + + + + + + + 

+ + + 

+ + + + + + 

+ + + + + + 



2 7 3 
2 8 2 + 
2 8 3 + + 



3 11+ + 

3 1 2 + + + + + -^"^"*""*" 
3 1 4 + + + + "^"^ 



+ + + + + + 
+ + + + + 



3 15 + + + + 

+ + + + + + + 



3 2 1 + + + + 



3 2 2 + + + 



+ + + + + + 



3 2 3 + + + + 



+ + + + + + 



3 2 4 + + + + + "*""^ 
3 



3 1 + + + + + + -^ 

3 3 2 + + + 

3 3 3 + + + 

3 3 4 + + 



+ + + + + + 
+ + + + + + 



+ + + + + + 
+ + + + + 
+ + 



3 5 + + + + 

4 1 + + + * 



3 

3 4 2 + + + + + + + + 



+ + + + + + 
+ + + + + + + 

3 4 5 + + + + + + + -*- 



3 4 3 + + + 
3 4 4 + + + 



3 5 1 



+ + + 

+ + + 



^3^4 .:.: + + + + --: 
'355 + ^ ^ + + + + + + * * 



3 6 1 

3 6 2 

3 6 3 

3 6 4 

3 6 5 



+ + + 

+ + + + 
+ + + 



+ + 

+ + 
+ + + 
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Appendix F (page 3 of 4) 

Grade 



Region Size SES K 1 2 3 4 5 6 7 8 9 10 11 12



571 4. + + + + 

3'"*- ^ , ^ J. 4.+ + 

3 7 2 ^ 

3 7 2 + + + + 

3 7 3 + 

3 7 3 + + + + 

3 7 3 + + + 

3 7 3 + + + ■ 



I ] 2 . : : * ^ ^ * ^ ^ ^ * 

3 7 4 



+ + + + 

+ + + + 

+ + + + + + 

+ + + + + 

+ + + 

+ + + + 

+ + + + + 

+ + + + + 

+ + 



+ 



+ + + + 
+ + + + + + 



3 7 5 + + + 

3 I I ^ ^ ^ - ^ ^ ' ^ ' : : : 

3 8 3 . ^ ^ - ; - : ^ : ^ ^ : 

* o ^ + + ^ 



8 5 
11 + + + 



+ + + + + 
+ + + + + 



1 2 + + + 

: 3 + + + + + + + + 

2 2 ^ ^ ^ ^ ^ ^ * ^ 



+ + + 

+ + + 

+ + + + + + 

3 ! + + + + + + 

3 2 ^ ^ ^ ^ ^ ^ ^ . t 



2 4 + 
2 4 
2 5 



3 + + + + + 



J 4 

3 5 + + 



+ + + + + 
+ + 



+ 

+ + + + + 

+ + + + 



J 4 + + + + + + + 

! * : : :::::: ^ * 

5 3 only Chapter I test data provided 

+ + + + + 

+ + + 4- + + + 

+ + + + + + + + + 



5 4 

5 5 

6 1 



+ + + 



? I ^ ^ ********* * 
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Appendix F (page 4 of 4) 

Grade 



Region Size SES K 1 2 3 4 5 6 7 8 9 10 11 12



7 3 + + +

7 3 + + + + + + + + + + + + +

7 3 + + + + + + + + + + + +

7 3 Criterion Referenced Test results only

7 4 + + + + + + + + + + + + +

7 4 + + + + + + +
7 4 + + + + +
7 4 + + + + + + + + + + + +

7 4 + + + + + + + + + + + +

7 5 + + + + +

8 3 + + + + + + + +

8 3 + + + + + + + + + + + +

8 3 + + + + + + + + + + + + +

8 4 + + + +



Totals 153 43 40 111 123 123 123 118 104 120 82 74 66 26 
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Appendix G 

Stem-and-Leaf Distributions of District Reports of the Percentage of Students Scoring
Above the National Median in Reading and Mathematics
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Appendix G 
Figure G-2 

Stem-and-Leaf Distribution of the District Percents of Students 
Scoring Above the National Median at Grade 2 



Reading

Stem   Leaf              Count

9 :                      0
9 :    12                2
8 :    577               3
8 :    0012              4
7 :    5799              4
7 :    12                2
6 :    555688899         9
6 :    0012344           7
5 :    56677788999       11
5 :    0122334444        10
4 :    557778899         9
4 :    111123344         9
3 :    999               3
3 :    1                 1
2 :    99                2
2 :    2                 1
1 :                      0
1 :                      0

P90 = 80
P75 = 68
P50 = 57
P25 = 47
P10 = 41



Mathematics

Stem   Leaf              Count

9 :    559               3
9 :    013               3
8 :    67                2
8 :    001334            6
7 :    5779              4
7 :    0001112222344     13
6 :    55566788889       11
6 :    000011222         9
5 :    56677889          8
5 :    001124            6
4 :    568               3
4 :    23                2
3 :    6                 1
3 :    4                 1
2 :                      0
2 :                      0
1 :    68                2
1 :                      0

P90 = 86
P75 = 74
P50 = 67
P25 = 57
P10 = 46
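A stem-and-leaf display can be read back into the underlying values: each leaf digit is appended to its stem, so stem 8 with leaves 577 stands for districts reporting 85, 87, and 87 percent. The sketch below does this for the Grade 2 reading display in Figure G-2. The names `READING_G2` and `expand` are ours, and the leaves were transcribed from the figure for illustration only.

```python
from statistics import median

# Grade 2 reading rows from Figure G-2, top to bottom: (stem, leaves).
# Each stem appears twice: leaves 5-9 first, then leaves 0-4.
READING_G2 = [
    (9, ""), (9, "12"), (8, "577"), (8, "0012"), (7, "5799"), (7, "12"),
    (6, "555688899"), (6, "0012344"), (5, "56677788999"), (5, "0122334444"),
    (4, "557778899"), (4, "111123344"), (3, "999"), (3, "1"),
    (2, "99"), (2, "2"), (1, ""), (1, ""),
]


def expand(rows):
    """Turn (stem, leaves) rows into the list of two-digit percents."""
    return [10 * stem + int(leaf) for stem, leaves in rows for leaf in leaves]


values = expand(READING_G2)
n_districts = len(values)      # equals the sum of the Count column
middle = median(values)        # the P50 entry under the display
```

Summing the Count column and recomputing the median this way gives a quick check that a transcribed display agrees with the percentiles printed beneath it.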






Appendix G 
Figure G-3 

Stem-and-Leaf Distribution of the District Percents of Students 
Scoring Above the National Median at Grade 3 
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Appendix G 
Figure G-4 

Stem-and-Leaf Distribution of the District Percents of Students 
Scoring Above the National Median at Grade 4 
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Appendix G 



Figure G-5 

Stem-and-Leaf Distribution of the District Percents of Students
Scoring Above the National Median at Grade 5



Reading

Stem   Leaf              Count

9 :                      0
9 :    03                2
8 :    5                 1
8 :    00112333          8
7 :    55578             5
7 :    0011223344        10
6 :    5699              4
6 :    00112224          8
5 :    666667788         9
5 :    0001122233        10
4 :    567888999         9
4 :    11244             5
3 :    55567799          8
3 :    02334             5
2 :    679               3
2 :                      0
1 :    9                 1
1 :                      0

P90 = 80
P75 = 72
P50 = 56
P25 = 45
P10 = 34



Mathematics

Stem   Leaf              Count

9 :    6                 1
9 :    0013              4
8 :    [illegible]       1
8 :    [illegible]       6
7 :    [illegible]       8
7 :    [illegible]       5
6 :    66677778888899    14
6 :    111122344444      12
5 :    556677899         9
5 :    002222244         9
4 :    5667888899        10
4 :    1344              4
3 :    57                2
3 :    2                 1
2 :                      0
2 :    2                 1
1 :                      0
1 :                      0

P90 = 82
P75 = 73
P50 = 64
P25 = 52
P10 = 45








Appendix G 
Figure G-6 

Stem-and-Leaf Distribution of the District Percents of Students
Scoring Above the National Median at Grade 6
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Appendix G 
Figure G-7 



Stem-and-Leaf Distribution of the District Percents of Students
Scoring Above the National Median at Grade 7
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Figure G-8 

Stem-and-Leaf Distribution of the District Percents of Students 
Scoring Above the National Median at Grade 8 
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Figure G-9 

Stem-and-Leaf Distribution of the District Percents of Students 
Scoring Above the National Median at Grade 9 
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Appendix G 
Figure G-10

Stem-and-Leaf Distribution of the District Percents of Students
Scoring Above the National Median at Grade 10
Figure G-11

Stem-and-Leaf Distribution of the District Percents of Students
Scoring Above the National Median at Grade 11 
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Appendix G 
Figure G-12 

Stem-and-Leaf Distribution of the District Percents of Students 
Scoring Above the National Median at Grade 12 
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